
If I wanted to use this software to finetune a 70b model on two 3090s to write fiction, what is the maximum sequence length that would be practical? I'm at the dataset collection stage, but I'm not sure whether to aim for bigger or smaller sequence lengths at the moment.

