> That such big performance increases can be extracted from relatively simple changes like rounding numbers or switching programming languages might seem surprising. But it reflects the breakneck speed with which LLMs have been developed. For many years they were research projects, and simply getting them to work well was more important than making them elegant. Only recently have they graduated to commercial, mass-market products. Most experts think there remains plenty of room for improvement. As Chris Manning, a computer scientist at Stanford University, put it: “There’s absolutely no reason to believe…that this is the ultimate neural architecture, and we will never find anything better.”
We have so many a-ha moments ahead of us in this field: seemingly minor changes yielding task-speed multipliers, fresh eyes on foundational codebases asking "now why the heck did they do it that way when xyz exists and works better?", and so on. A recent graphics driver update took my local SD performance from almost 4 seconds per iteration to 2.7 it/s because someone somewhere had an a-ha moment. We're practically in the Commodore 64 era of this technology, and there are only going to be more and more people putting their eyes and minds on these problems every day.