Instead, there are two options. One is to take the user input, put it in the training corpus, and retrain the neural net. The other is to use the user input as up/down votes for RLHF, which adjusts the behavior of the weights that already exist.
Depending on what “it” is, it does so through in-context learning, though that is obviously limited to the context window.
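To make the context-window limit concrete, here is a rough sketch of how a chat front end might trim history before each request. Everything in it is an assumption for illustration: `build_context` is a hypothetical helper, the 12-token budget is arbitrary, and counting whitespace-separated words is only a crude stand-in for a real tokenizer.

```python
from collections import deque


def build_context(messages, max_tokens):
    """Keep the most recent messages that fit in a fixed token budget.

    Hypothetical helper: walks the history from newest to oldest and
    stops at the first message that would exceed the budget.
    """
    kept = deque()
    used = 0
    for msg in reversed(messages):
        cost = len(msg.split())  # crude approximation of a token count
        if used + cost > max_tokens:
            break  # this message and everything older is dropped
        kept.appendleft(msg)
        used += cost
    return list(kept)


history = [
    "My name is Ada.",
    "I like chess.",
    "What openings should I study?",
    "What is my name?",
]

# With a small budget the oldest message no longer fits, so whatever
# the model "learned" from it in context is gone for this request.
context = build_context(history, max_tokens=12)
print(context)
```

Nothing in the weights changes here. The earlier message only influenced the output while it was still inside the window.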