Hacker News

Yes, I personally feel that the "official" benchmarks are increasingly diverging from the everyday reality of using these models. My theory is that we are reaching a point where all the models are intelligent enough for day-to-day queries, so factors like style/personality, proper use of web queries, and other capabilities are better differentiators than intelligence alone.



