
What is this and why does it take the top two spots on HN?


One thread will (probably) be merged into the other, but GPT-2 was an extremely popular OpenAI project that generated long, realistic-sounding text/articles if you gave it a simple starting sentence or topic sentence. GPT-3 is an iteration on that, so it's likely a huge improvement.


It doesn't sound like it's an improvement at all, but instead requires less training data to produce worse results?


MUCH less task-specific training data for SLIGHTLY worse results. It's a huge benefit to be able to make that trade-off.


Is the reverse also true? If you have the training data necessary for "good" results on GPT-2, is it generally correct to assume that a fine-tuned GPT-2 would provide better results on your task than GPT-3?


If you can answer this question without running both models over the data set, you've got a very good paper on your hands.


This is a massive improvement in the sense that previously you had to retrain (i.e. fine-tune) the stock model on a specialized dataset to get good results for a particular task.
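For anyone unfamiliar with what "retraining the stock model" means in practice, here's a rough sketch of the two workflows. It uses the Hugging Face transformers library and the public GPT-2 checkpoint purely as a stand-in (GPT-3's weights aren't public), and the dataset name and prompt contents are made-up placeholders, not anything from the paper:

    # Illustrative contrast between the two workflows, using the public GPT-2
    # checkpoint from the Hugging Face "transformers" library as a stand-in.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Old workflow (GPT-2 era): update the stock weights on a task-specific
    # dataset (thousands of labelled examples, a training loop, GPU time), e.g.:
    #   Trainer(model=model, args=..., train_dataset=my_task_dataset).train()

    # Few-shot workflow: no weight updates at all; the "training data" is just
    # a handful of examples pasted into the prompt.
    prompt = (
        "Translate English to French.\n"
        "cheese => fromage\n"
        "house => maison\n"
        "cat =>"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=5)
    print(tokenizer.decode(outputs[0]))

The trade-off discussed upthread is exactly this: the second block skips the entire fine-tuning step, at the cost of somewhat lower accuracy than a model fine-tuned on a full task dataset.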


GPT-2 was a groundbreaking advancement in NLP, and this is an iteration on it: a general-purpose language model that can answer questions, write full articles that are (mostly) indistinguishable from human writing, do some translation, etc.



