
This is going to happen a lot over the next few years. One can fine-tune GPT-2 medium on an RTX 2070. Training GPT-2 medium from scratch can be done for $162 on vast.ai. The newer H100/Trainium/tensor-core chips will bring the price down even further.
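For concreteness, here is a minimal sketch of what fine-tuning GPT-2 medium on a single consumer GPU can look like, using Hugging Face transformers/datasets. The dataset, hyperparameters, and output directory are placeholders I've picked for illustration, not anything from the comment; batch size, fp16, and gradient accumulation are the knobs that make it fit in ~8 GB of VRAM.

    from datasets import load_dataset
    from transformers import (
        AutoTokenizer,
        AutoModelForCausalLM,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    model_name = "gpt2-medium"                  # ~355M parameters
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token   # GPT-2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Placeholder corpus: any plain-text dataset with a "text" column works.
    raw = load_dataset("wikitext", "wikitext-103-raw-v1", split="train[:1%]")

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

    args = TrainingArguments(
        output_dir="gpt2-medium-finetuned",
        per_device_train_batch_size=1,      # tiny batch to fit in ~8 GB VRAM
        gradient_accumulation_steps=16,     # effective batch size of 16
        fp16=True,                          # mixed precision helps on an RTX 2070
        num_train_epochs=1,
        logging_steps=50,
    )

    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()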

I suspect fully replicating ChatGPT from scratch would take ~$1-2 million including label acquisition, of which probably only ~$200-500k is compute.

The next few years are going to be wild!



These models have reached the tipping point where they provide significant utility to a significant portion of the computer scientists building them. It could be that each coming iteration of these tools will make it increasingly easy to write the code for the next.

I wonder if this is the first rumblings of the singularity.


I can imagine a world where there are an infinity of "local maxima" that stop a system from reaching a singular feedback loop… imagine if our current tools help write the next generation, and so on, until it gets stuck in some local optimization somewhere. Getting stuck seems more likely than not getting stuck, right?


ChatGPT being able to write OpenAI API code is great, and all companies should prepare samples so future models can correctly interface with their systems.
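As a rough illustration of the kind of "sample" a company might publish, here is a minimal, self-contained call against the OpenAI completions API using the openai Python client of that era; the model name, prompt, and parameters are placeholders, not an official example.

    import os
    import openai

    # Assumes the API key is provided via an environment variable.
    openai.api_key = os.environ["OPENAI_API_KEY"]

    response = openai.Completion.create(
        model="text-davinci-003",
        prompt="Summarize this changelog in one sentence:\n...",
        max_tokens=64,
        temperature=0.2,
    )
    print(response["choices"][0]["text"].strip())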

But what will be needed is an AI that implements scientific papers. About 30% of papers have an accompanying code implementation. That's a sizeable dataset to train a Codex-style model on; a sketch of how such pairs might be assembled follows.
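A rough sketch of the data-assembly step, under my own assumptions (not from the comment): pair each paper's abstract with its repository code to form (prompt, completion) training examples in JSONL form. The directory layout and file names here are hypothetical.

    import json
    from pathlib import Path

    PAIRS_DIR = Path("paper_code_pairs")   # hypothetical: one subfolder per paper
    OUT_FILE = Path("codex_finetune.jsonl")

    with OUT_FILE.open("w") as out:
        for paper_dir in sorted(PAIRS_DIR.iterdir()):
            abstract_file = paper_dir / "abstract.txt"
            if not abstract_file.exists():
                continue
            abstract = abstract_file.read_text()
            # Concatenate the repository's Python files as the completion target.
            code = "\n\n".join(
                p.read_text() for p in sorted(paper_dir.rglob("*.py"))
            )
            if not code:
                continue
            # Prompt = paper text, completion = implementation, one JSON per line.
            out.write(json.dumps({"prompt": abstract, "completion": code}) + "\n")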

You can have AI generating papers, AI implementing papers, and then learning to predict experimental results. This is how you bootstrap a self-improving AI.

It does not only learn how to recreate itself; it learns how to solve all problems at the same time. A data-engineering approach to AI: search and learn / solve and learn / evolve and learn.



