
What is this and why does it take the top two spots on HN?


One thread will (probably) be merged into the other, but GPT-2 was an extremely popular OpenAI project that generated long, realistic-sounding text/articles if you gave it a simple starting sentence or topic sentence. GPT-3 is an iteration on that, so it's likely a huge improvement.


It doesn't sound like it's an improvement at all, but instead requires less training data to produce worse results?


MUCH less task-specific training data for SLIGHTLY worse results. It's a huge benefit to be able to make that trade-off.


Is the reverse also true? If you have the training data necessary for "good" results on GPT-2, is it generally correct to assume that a fine-tuned GPT-2 would provide better results on your task than GPT-3?


If you can answer this question without running both models over the data set, you've got a very good paper on your hands.


This is a massive improvement in the sense that previously you had to retrain (i.e. fine-tune) the stock model on a specialized dataset to get good results for a particular task.
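For anyone unfamiliar with what "retraining the stock model" means in practice, here's a rough sketch of the two workflows. It uses the Hugging Face transformers library and the public GPT-2 checkpoint purely as a stand-in (GPT-3's weights aren't public), and the dataset name and prompt contents are made-up placeholders, not anything from the paper:

    # Illustrative contrast between the two workflows, using the public GPT-2
    # checkpoint from the Hugging Face "transformers" library as a stand-in.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Old workflow (GPT-2 era): update the stock weights on a task-specific
    # dataset (thousands of labelled examples, a training loop, GPU time), e.g.:
    #   Trainer(model=model, args=..., train_dataset=my_task_dataset).train()

    # Few-shot workflow: no weight updates at all; the "training data" is just
    # a handful of examples pasted into the prompt.
    prompt = (
        "Translate English to French.\n"
        "cheese => fromage\n"
        "house => maison\n"
        "cat =>"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=5)
    print(tokenizer.decode(outputs[0]))

The trade-off discussed upthread is exactly this: the second block skips the entire fine-tuning step, at the cost of somewhat lower accuracy than a model fine-tuned on a full task dataset.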


GPT-2 was a groundbreaking advancement in NLP, and this is an iteration on it: a general-purpose language model that can answer questions, write full articles that are (mostly) indistinguishable from human writing, do some translation, etc.



