I wish people would get off the "AI is the worst thing for the environment" bandwagon. AI and data centers as a whole aren't even in the top 100 emitters of pollution and never will be.
If you want to complain about tech companies ruining the environment, look towards policies that force people to come into the office. Pointless commutes are far, far worse for the environment than all data centers combined.
Complaining about the environmental impact of AI is like plastic manufacturers putting recycling labels on plastic that is inherently not recyclable, making it seem like plastic pollution is everyday people's fault for not recycling enough.
AI's impact on the environment is so tiny it's comparable to a rounding error when held up against the output of say, global shipping or air travel.
Why don't people get this upset at airport expansions? They're vastly worse.
The answer to that is simple: They hate AI and the environment angle is just an excuse, much like their concern over AI art. Human psychology is such that many of these people actually believe the excuse too.
It helps when you put yourself in the shoes of people like that and ask yourself, if I find out tomorrow that the evidence that AI is actually good for the environment is stronger, will I believe it? Will it even matter for my opposition to AI? The answer is no.
You don't know that. I don't know about you (and whatever you wrote possibly tells more about yourself than anyone else), but I prefer my positions strong and based on reality, not based on lies (to myself included).
And the environment is far from being the only concern.
You are attacking a straw man. For you, being against GenAI, simply because it happens to be against your beliefs, is necessarily irrational. Please don't do this.
> I prefer my positions strong and based on reality, not based on lies (to myself included).
Then you would be the exception, not the rule.
And if you find yourself attached to any ideology, then you are also wrong about yourself. Subscribing to any ideology is by definition lying to yourself.
Being able to place yourself into the shoes of others is something evolution spent 1000s of generations hardwiring into us, I'm very confident in my reading of the situation.
> Having beliefs, principles or values is not lying to oneself.
The lie is that you adopted "beliefs, principles or values" which cannot ever serve your interests, you have subsumed yourself into something that cannot ever reciprocate. Ideology by definition even alters your perceived interests, a more potent subversion cannot be had (up to now, with potential involuntary neural interfaces on the horizon).
> Citation needed
I will not be providing one, but that you believe one is required is telling. There is no further point to this discussion.
People are allowed to reject whatever they want, I'm sorry that democracy is failing you to make slightly more money while the rest of society suffers.
I'm glad people are grabbing the reins of power back from some of the most evil people on the planet.
Of course they aren't polluters in the sense of generating some kind of smoke themselves. But they do consume megawatts upon megawatts of power that has to be generated somewhere. It's not often you have the luxury of building near a nuclear power plant. And in the end you're still releasing those megawatts as heat into the atmosphere.
The way the current architecture works—as far as I know—is your assumed "server caches the generated output" step doesn't exist. What you get in your output is streamed directly from the LLM to your client. Which is, in theory, the most efficient way to do it.
That's why LLM outputs that get cut off mid-stream require the end user to click the "retry" button and not a "re-send me that last output" button (which doesn't exist).
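A minimal sketch of the streaming shape described above (all names are illustrative, not any provider's actual code): tokens go straight from the model to the client with no server-side copy, so a dropped connection loses the partial output and the only recourse is a retry.

```python
def generate_tokens(prompt):
    # Stand-in for real model inference: yield tokens one at a time.
    for token in f"echo: {prompt}".split():
        yield token

def stream_to_client(prompt, client_alive):
    """Stream tokens directly to the client; nothing is cached server-side.

    If the connection drops mid-stream, the partial output is simply gone,
    which is why the client's only option is "retry" (re-send the prompt),
    not "re-send me that last output".
    """
    sent = []
    for token in generate_tokens(prompt):
        if not client_alive():
            return sent, False  # partial output, lost once this returns
        sent.append(token)
    return sent, True
```

Running the happy path returns the full token list; simulating a mid-stream disconnect returns only what made it out before the drop.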
I would imagine a simpler approach would be to make the last prompt idempotent... which would require caching on their servers; something that supposedly isn't happening right now. That way, if the user re-sends the last prompt, the server just responds with the exact same output it just generated. Except LLMs often make mistakes and hallucinate things... so re-sending the last prompt and hoping for a better output isn't an uncommon thing.
Soooo... Back to my suggested workaround in my other comment: Pub/sub over WebSockets :D
The user's last prompt can be sent with an idempotency key that changes each time the user initiates a new request. If that's the same, use the cache. If it's new, hit the LLM again.
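The idempotency-key scheme above can be sketched in a few lines (hypothetical class and method names, assuming the server is willing to cache responses):

```python
class IdempotentChat:
    """Sketch: cache one response per idempotency key. Re-sending the same
    key replays the cached output with no inference; a new key (e.g. the
    user explicitly asking to regenerate) hits the model again."""

    def __init__(self, model):
        self.model = model   # callable: prompt -> full response text
        self.cache = {}      # idempotency_key -> cached response

    def ask(self, prompt, idempotency_key):
        if idempotency_key in self.cache:
            return self.cache[idempotency_key]  # replay, no model call
        response = self.model(prompt)
        self.cache[idempotency_key] = response
        return response
```

With this shape, a retry after a dropped connection reuses the old key and costs nothing, while a deliberate "try again for a better answer" sends a fresh key and triggers new inference.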
The only reason an LLM server responds with partial results instead of waiting and returning everything at once is UX: it's just too slow otherwise. But the problem of slow bulk responses isn't unique to LLMs and can be solved well enough within HTTP 1.1. It doesn't have to be the same server; it can be a caching proxy in front of it. Any privacy concerns can be addressed by giving the user the opportunity to tell the server to cache or not to cache (which could be as easy as submitting with PUT vs. POST requests).
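The PUT-vs-POST convention suggested above could look roughly like this (a toy routing rule, not real HTTP machinery; PUT happens to be the idempotent method in HTTP semantics, which is why it maps naturally to "cache and replay"):

```python
def proxy_handle(method, key, body, cache, upstream):
    """Toy caching-proxy rule: PUT means "cacheable, replay if already
    seen"; POST means "always forward to the LLM server". `upstream` is a
    stand-in for the actual inference backend."""
    if method == "PUT" and key in cache:
        return cache[key]          # replay cached response, no inference
    response = upstream(body)      # forward to the LLM server
    if method == "PUT":
        cache[key] = response      # user opted in to caching
    return response
```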
Pub/sub via WebSockets seems like the simplest solution. You'll need to change your LLM serving architecture around a little bit to use a pub/sub system that a microservice can grab the output from (to send to the client) but it's not rocket science.
It's yet another system that needs some DRAM though. The good news is that you can auto-expire the queued up responses pretty fast :shrug:
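The "queue responses in DRAM but auto-expire them fast" idea can be sketched as a tiny in-memory store (illustrative names; a real deployment would likely use something like Redis with a TTL behind the pub/sub layer):

```python
import time

class ExpiringResponseStore:
    """Toy sketch: a microservice publishes finished (or partial) LLM
    outputs here; a client that reconnects within the TTL can fetch its
    response, and stale entries are purged so memory use stays bounded."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock        # injectable for testing
        self.entries = {}         # request_id -> (expires_at, payload)

    def publish(self, request_id, payload):
        self.entries[request_id] = (self.clock() + self.ttl, payload)

    def fetch(self, request_id):
        self._purge()
        entry = self.entries.get(request_id)
        return entry[1] if entry else None

    def _purge(self):
        now = self.clock()
        self.entries = {k: v for k, v in self.entries.items() if v[0] > now}
```

Injecting the clock makes the expiry behavior easy to verify without real waiting: publish, fetch within the TTL, advance the fake clock past it, and the entry is gone.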
No idea if it's worth it, though. Someone with access to the statistics surrounding dropped connections/repeated prompts at a big LLM service provider would need to do some math.
I think it would be even more wasteful to continue inference in background for nothing if the user decided to leave without pressing the stop button. Saving the partial answer at the exact moment the client disappeared would be better.
Perhaps we can call this type of maneuver, "The Sam Altman": Your expensive business's mid-term outlook not looking so good? Why not use all that cash/credit to corner the market in some commodity in order to cripple your perceived competition?
He's not the first one though. The crypto miners used to do the same (I distinctly 'member first GPUs, then HDDs, then ordinary RAM being squoze by yet another new shitcoin in less than a year), and Uber plus the food delivery apps are a masterclass in how to destroy competition with seemingly infinite cash.
ebooks as a platform will never evolve until ereaders (like these) get ~30FPS refresh rates. That's when "scrollytelling" can enter the race and could very well expand the industry into new territory.
The previous Kindle Scribe had a slow refresh rate, and it showed every time you tried to turn a page. All I want so far as refresh rates are concerned is seamless page-turning – page-turning that doesn’t make me wait. Will this version of the Scribe be any better? The Wired review doesn’t say.
It's close --- used to be I would start the page turn when on the next-to-last line on the page, but more recent Kindles are fast enough that I don't bother, and it doesn't feel _that_ much slower than turning a physical page.
I remember the early days of the ipad 1 where publishers and technologists were stoked about all the cool new interactive things they could do with this format.
It flopped. It turns out interactive infographics and scrollytelling are fun (and costly) to make but readers don't really like them.
The smashing success story wasn't actually what you can do with the new devices' screen, it was audio. It turns out audiobooks (and podcasts) are a huge hit when the price is right and you make it accessible enough.
“scrollytelling”? Scrolling? Or tap to slideshow, which doesn’t require scrolling? Or some novel format that uses scrolling as a gesture to “advance”? Wouldn’t that have taken off somewhere other than overwrought marketing pages on Apple.com? Is it different than tapping?
What do you imagine would use that? I can only think of smooth scrolling on a web toon or something, but you would want much better color reproduction first.