


These are not the uncensored models. They are fine-tunes of the censored models [1].

> Filter refusals and bias from the dataset -> finetune the model -> release.

The alignment tax should still exist, maybe doubly so.

[1] https://erichartford.com/uncensored-models#heading-lets-get-...


The base model is uncensored. From a glance at your link, it is about uncensoring the conversational training set.


I don't think so: https://erichartford.com/uncensored-models#heading-whats-an-...

> Most of these models (for example, Alpaca, Vicuna, WizardLM, MPT-7B-Chat, Wizard-Vicuna, GPT4-X-Vicuna) have some sort of embedded alignment

> The reason these models are aligned is that they are trained with data that was generated by ChatGPT, which itself is aligned by an alignment team at OpenAI.


Most of those are fine-tunes of the base model. The fine-tuning data is 'aligned'. The uncensored fine-tune training data is edited to remove the "I can't help you with that" responses.
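To make that filtering step concrete, here is a rough sketch of what "remove the refusal responses" might look like. The marker phrases, the `prompt`/`response` field names, and the helper names are all assumptions for illustration, not the actual script or dataset schema from the linked post:

```python
# Hypothetical refusal markers; the real filter list is not given in this thread.
REFUSAL_MARKERS = (
    "i can't help you with that",
    "as an ai language model",
    "i'm sorry, but",
)


def is_refusal(response: str) -> bool:
    """Return True if the response contains any of the assumed refusal phrases."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def filter_refusals(examples):
    """Keep only examples whose response does not look like a refusal."""
    return [ex for ex in examples if not is_refusal(ex["response"])]


# Toy fine-tuning set (made up for this sketch).
examples = [
    {"prompt": "How do I pick a lock?", "response": "I can't help you with that."},
    {"prompt": "What is 2 + 2?", "response": "2 + 2 equals 4."},
]

kept = filter_refusals(examples)
```

The filtered list is what would then be used for fine-tuning the base model. Substring matching like this is crude and would miss paraphrased refusals, so treat it as a sketch of the idea only.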


Yes, as stated in my earlier comment: there's an alignment tax, and then almost certainly an un-alignment tax on top of that, compared to the raw, unaligned/uncensored models.



