
> Data quality like you're describing just doesn't matter. GPT is trained on PEBIBYTES of data.

It matters a little bit, in a quantitative but not qualitative way. With good data cleaning you could probably get as high-quality a result from one pebibyte of data as you normally would from two. If training time is proportional to dataset size, that means maybe three months of training instead of six. That could save hundreds of millions of dollars, or even a billion, which I guess would matter to someone. It probably wouldn't matter qualitatively, though.
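As a back-of-envelope sketch of that proportionality argument (every number below is an assumption I picked for illustration, not a real GPT training figure):

    # Assumption: training time and cost scale linearly with dataset size.
    # All figures are illustrative, chosen only to mirror the estimate above.

    raw_dataset_pib     = 2.0   # assumed raw dataset size (PiB)
    cleaned_dataset_pib = 1.0   # assumed size after aggressive cleaning (PiB)
    baseline_months     = 6.0   # assumed training time on the raw data
    baseline_cost_usd   = 1e9   # assumed training cost on the raw data

    scale = cleaned_dataset_pib / raw_dataset_pib  # 0.5 under these assumptions

    print(f"training time: {baseline_months * scale:.1f} months instead of {baseline_months:.1f}")
    print(f"cost saved:    ${baseline_cost_usd * (1 - scale):,.0f}")

Under these made-up numbers it prints 3.0 months and $500,000,000 saved, which is the quantitative-but-not-qualitative point: the same model, cheaper and sooner.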


