Save The Planet

I know she's exaggerating but this post yet again underscores how nobody understands that it is training AI which is computationally expensive. Deployment of an AI model is a comparable power draw to running a high-end videogame. How can people hope to fight back against things they don't understand?
She's not exaggerating, if anything she's undercounting the number of tits.
Well you asked for six tits but you're getting five. Why? Because the AI is intelligent and can count, obviously.
I mean, continued use of AI encourages the training of new models. If nobody used the image generators, they wouldn't keep trying to make better ones.
you are correct, and also not in any way disagreeing with me.
I try lol.
TBH most people still use old SDXL finetunes for porn, even with the availability of newer ones.
It's closer to running 8 high-end video games at once. Sure, from a scale perspective it's further removed from training, but it's still fairly expensive.
nice name btw
really depends. You can locally host an LLM on a typical gaming computer.
You can, but that's not the kind of LLM the meme is talking about. It's about the big LLMs hosted by large companies.
True, and that's how everyone who is able should use AI, but OpenAI's models are in the trillion-parameter range. That's 2-3 orders of magnitude more than what you can reasonably run yourself.
This is still orders of magnitude less than what it takes to run an EV, which is an eco-friendly form of carbrained transportation. Especially if you live in an area where the power source is renewable. On that note, it looks to me like AI is finally going to be the impetus to get the U.S. to invest in and switch to nuclear power -- isn't that altogether a good thing for the environment?
lol just purely wrong on that one. Hilarious.
Well that's sort of half right. Yes you can run the smaller models locally, but usually it's the bigger models that we want to use, and those would be very slow on a typical gaming computer, or even a high-end one. To make it go faster, the hardware used in datacenters is not only more optimised for the task, it's also simply faster per unit, and there are far more units in a server than you'd ever find in a gaming PC.
Now these things aren't magic, the basic technology is the same, so where does the speed come from? The answer is raw power: these things run insane amounts of power through them, with specialised cooling systems to keep them cool. This comes at the cost of efficiency.
So whilst running a model is much cheaper compared to training a model, it is far from free. And whilst you can run a smaller model on your home PC, it isn't directly comparable to how it's used in the datacenter. So the use of AI is still very power hungry, even when not counting the training.
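For a sense of the raw numbers, a minimal back-of-envelope sketch: the GPU figures are the published TDPs (RTX 4090 ~450 W, H100 SXM ~700 W), while the per-server overhead is just an assumption.

```python
# Back-of-envelope only, not a measurement.
gaming_gpu_w = 450        # published RTX 4090 board power, i.e. one "high-end game"
h100_sxm_w = 700          # published TDP of a single datacenter H100 SXM GPU
gpus_per_server = 8       # a typical inference node
overhead_w = 2000         # assumed: CPUs, fans, networking, PSU losses

server_w = h100_sxm_w * gpus_per_server + overhead_w
print(f"One 8-GPU inference server: ~{server_w} W, "
      f"roughly {server_w / gaming_gpu_w:.0f}x a single gaming GPU")
```

Whether that counts as power hungry per person then depends entirely on how many users that one server is handling at once, which is what the rest of this thread argues about.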
Yeh but those local models are usually pretty underpowered compared to the ones that run via online services, and are still more demanding than any game.
Not at all. Not even close.
Image generation is usually batched and takes seconds, so 700W (a single H100 SXM) for a few seconds for a batch of a few images to multiple users. Maybe more for the absolute biggest (but SFW, no porn) models.
LLM generation takes more VRAM, but is MUCH more compute-light. Typically one has banks of 8 GPUs in multiple servers serving many, many users at once. Even my lowly RTX 3090 can serve 8+ users in parallel with TabbyAPI (and a modestly sized model) before becoming more compute bound.
So in a nutshell, imagegen (on an 80GB H100) is probably more like 1/4-1/8 of a video game at once (not 8 at once), and only for a few seconds.
Text generation is similarly efficient, if not more. Responses take longer (many seconds, except on special hardware like Cerebras CS-2s), but it's parallelized over dozens of users per GPU.
This is excluding more specialized hardware like Google's TPUs, Huawei NPUs, Cerebras CS-2s and so on. These are clocked far more efficiently than Nvidia/AMD GPUs.
...The worst are probably video generation models. These are extremely compute intense and take a long time (at the moment), so you are burning like a few minutes of gaming time per output.
ollama/sd-web-ui are terrible analogs for all this because they are single user, and relatively unoptimized.
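To put rough numbers on the imagegen claim above, a tiny sketch: the 700 W figure is from the comment, while the batch size and generation time are assumptions.

```python
# Sketch only: real batch sizes and timings vary a lot between models.
h100_w = 700              # single H100 SXM at full load (from the comment above)
gen_seconds = 5           # assumed: "a few seconds" per batch
images_per_batch = 4      # assumed: "a batch of a few images"
gaming_gpu_w = 400        # a high-end gaming GPU under load, roughly

wh_per_image = h100_w * gen_seconds / 3600 / images_per_batch
gaming_seconds_equiv = wh_per_image / (gaming_gpu_w / 3600)
print(f"~{wh_per_image:.2f} Wh per image, "
      f"about {gaming_seconds_equiv:.0f} seconds of gaming")

# LLM serving: dozens of users sharing one GPU means each user's slice
# of that 700 W is a small fraction of a gaming session.
for users in (8, 32, 64):
    print(f"  {users} parallel users -> ~{h100_w / users:.0f} W per user")
```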
How exactly did you come across this "fact"?
I compared the TDP of an average high-end graphics card with the GPUs required to run big LLMs. Do you disagree?
I do, because they're not at full load the entire time they're in use
They are, it'd be uneconomical not to use them fully the whole time. Look up how batching works.
I mean, I literally run a local LLM. While the model sits in memory it's really not using up a crazy amount of resources. TBF I should hook something up to actually measure exactly how much it's pulling, instead of just looking at htop/atop and guesstimating based on load.
Versus when I play a game, where the fans start blaring, the machine heats up, and you can clearly see the usage increasing across various metrics.
He isn't talking about locally, he is talking about what it takes for the AI providers to provide the AI.
To say "it takes more energy during training" entirely depends on the load put on the inference servers, and the size of the inference server farm.
There's no functional difference aside from usage and scale, which is my point.
I find it interesting that the only actual energy calculations I see from researchers cover the training and the things that go along with the training, rather than the usage per actual request after training.
People then conflate training energy costs with normal usage costs without data to back it up. I don't have the data either, but I do have what I can do/see on my side.
I'm not sure that's true. If you look up things like "tokens per kWh" or "tokens per second per watt" you'll get results from people measuring their power usage while running specific models on specific hardware. This is mainly for consumer hardware, since it's people looking to run their own AI servers who are posting about it, but it sets an upper bound.
The AI providers are tight-lipped about how much energy they use for inference and how many tokens they complete per hour.
You can also infer a bit by doing things like looking up the power usage of a 4090, and then looking at the tokens per second perf someone is getting from a particular model on a 4090 (people love posting their token per second performance every time a new model comes out), and extrapolate that.
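As a rough sketch of that extrapolation (both numbers below are placeholders; swap in whatever figures people actually post):

```python
# Placeholder figures for illustration only.
gpu_w = 450               # rough board power of a 4090 under load
tokens_per_second = 100   # assumed throughput for some mid-sized local model

tokens_per_kwh = tokens_per_second * 3600 / gpu_w * 1000
wh_per_1k_tokens = gpu_w / (tokens_per_second * 3.6)
print(f"~{tokens_per_kwh:,.0f} tokens per kWh")
print(f"~{wh_per_1k_tokens:.1f} Wh per 1,000 tokens")
```

Datacenter deployments with batching will do better per token than a single-user 4090, so like the earlier comment says, this really is an upper bound.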
One user vs a public service is apples to oranges and it's actually hilarious you're so willing to compare them.
It's literally the same thing; the obvious difference is how much usage each GPU is getting at a time. But everyone seems to assume all these data centers are running at full load at all times for some reason?
It's explicitly and literally not the same thing.
The highest likelihood is you have literally no idea how any of this works and are just joining the "AI bad because energy usage" crowd, having done zero research and with no hands-on knowledge of how these tools actually work.
My guy, we're not talking about just leaving a model loaded, we're talking about actual usage in a cloud setting with far more GPUs and users involved.
So you think they're all at full load at all times? Does that seem reasonable to you?
Given that cloud providers are desperately trying to get more compute resources, but are limited by chip production - yes, of course? Why would they be trying to expand their resources if their existing resources weren't already maxed out?
Because they want the majority of the new chips for training models, not for running the existing ones, would be my assertion. Two different use cases.
Right, but that's kind of like saying "I don't kill babies" while you use a product made from murdered baby souls. Yes, you weren't the one who did it, but your continued use of it caused the babies to be killed.
There is no ethical consumption under capitalism and all that, but I feel like there's a line we're crossing here. This fruit is hanging so low it's brushing the grass.
"The plane is flying, anyway."
Are you interpreting my statement as being in favour of training AIs?
I'm interpreting your statement as "the damage is done so we might as well use it"
And I'm saying that using it causes them to train more AIs, which causes more damage.
I agree with your second statement. You have misunderstood me. I am not saying the damage is done so we might as well use it. I am saying people don't understand that it is the training of AIs which is directly power-draining.
I don't understand why you think that my observation people are ignorant about how AIs work is somehow an endorsement that we should use AIs.
I guess.
It still smells like an apologist argument to be like "yeah but using it doesn't actually use a lot of power".
I'm actually not really sure I believe that argument either, though. I'm pretty sure that inference is hella expensive. When people talk about training, they don't talk about the cost to train on a single input, they talk about the cost for the entire training. So why are we talking about the cost to infer on a single input?
What's the cost of running training, per hour? What's the cost of inference, per hour, on a similarly sized inference farm, running at maximum capacity?
Maybe you should stop smelling text and try reading it instead. :P
Running an LLM in deployment can be done locally on one's machine, on a single GPU, and in this case is like playing a video game for under a minute. OpenAI models are larger than that by a factor of 10 or more, so it's maybe like playing a video game for 15 minutes (obviously varies based on the response to the query.)
It makes sense to measure deployment usage marginally based on its queries for the same reason it makes sense to measure the environmental impact of a car in terms of hours or miles driven. There's no natural way to do this for training though. You could divide training by the number of queries, to amortize it across its actual usage, which would make it seem significantly cheaper, but it comes with the unintuitive property that this amortization weight goes down as more queries are made, so it's unclear exactly how much of the cost of training should be assigned to a given query. It might make more sense to talk in terms of expected number of total queries during the lifetime deployment of a model.
You're way overcomplicating how it could be done. The argument is that training takes more energy:
Typically if you have a single cost associated with a service, then you amortize that cost over the life of the service: so you take the total energy consumption of training and divide it by the total number of user-hours spent doing inference, and compare that to the cost of a single user running inference for an hour (which they can estimate by dividing their global inference energy consumption for an hour by the number of user-hours served in that hour).
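Spelled out as a sketch (every number here is a made-up placeholder; the point is the bookkeeping described above, not the values):

```python
def energy_per_user_hour(training_kwh, lifetime_user_hours,
                         fleet_kwh_in_hour, user_hours_in_that_hour):
    """Amortized training energy vs marginal inference energy, per user-hour."""
    amortized_training = training_kwh / lifetime_user_hours
    marginal_inference = fleet_kwh_in_hour / user_hours_in_that_hour
    return amortized_training, marginal_inference

# Placeholder inputs only -- nobody outside the AI providers has the real ones.
train, infer = energy_per_user_hour(
    training_kwh=1e7, lifetime_user_hours=1e9,
    fleet_kwh_in_hour=2e4, user_hours_in_that_hour=1e6)
print(f"training, amortized over the model's life: {train * 1000:.0f} Wh per user-hour")
print(f"inference, marginal:                       {infer * 1000:.0f} Wh per user-hour")
```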
If these are "apples to oranges" comparisons, then why do people defending AI usage (and you) keep making the comparison?
But even if it were true that training is significantly more expensive than inference, or that they're inherently incomparable, that doesn't actually change the underlying observation that inference is still quite energy intensive, and the implicit value statement that the energy spent isn't worth the effect on society.
That's a good point. I rescind my argument that training is necessarily more expensive than sum-of-all-deployment.
I still think people overestimate the power draw of AI though, because they're not dividing it by the overall usage of AI. If people started playing high-end video games at the same rate AI is being used, the power usage might be comparable, but it wouldn't mean that an individual playing a video game is suddenly worse for the environment than it was before. However, it doesn't really matter, since ultimately the environmental impact depends only on the total amount of power (and coolant) used, and where that power comes from (could be coal, could be nuclear, could be hydro).
You're absolutely right that the environmental impact depends on the source of the energy, and less obviously, on the displaced demand that now has to seek energy from less clean sources. Ideally we should have lots of clean energy, but unfortunately we often don't, and even when AI uses clean sources, it's often just forcing preexisting load elsewhere. If we could start investing in power infrastructure projects at the national (or state/province) level then maybe it wouldn't be so bad, but it never happens at the scale we need.
I think the argument isn't the environmental impact alone, it's the judgement about the net benefit of both the environmental impact and the product produced. I think the statement is "we spent all this power, and for what? Some cats with tits and an absolutely destroyed labour market. Not worth the cost"
Especially because it's a cost that the users of AI are forcing everyone to pay. Privatize profits, socialize losses, and all that.
I think a different way to look at what you've brought up in the second paragraph is that people are angry and talking about the power usage because they dislike AI, not the other way around. It wouldn't really make sense for people to be angry about the power usage of AI if that power usage had no environmental impact.
How about, fuck AI, end story.
how about, fuck capitalism? Have you lost sight of the goal?
What tools do you think capitalism is going to use to fuck us harder and faster than ever before?
All of them at their disposal, we should get rid of all tools
Running a concept to its extreme just to try and dismiss it to sound smart is an entire damn logical fallacy. Why are you insisting on using fallacies that brainless morons use?
Have you never heard of a straw man fallacy? That's you. That's what you're doing.
So mad at a joke
Still dodging any actual point. Have fun being too stupid to actually engage with a discussion. Genuinely pitiful of you.
Ah yes, the bastion of intelligence displayed in your comment is too staggering for my minuscule brain to comprehend.
Go argue with someone else dick
I did, as a matter of fact, fuck AI.
You thought blind anger came from well informed opinions?
But then the rage machine couldn't rage
there is so much rage today. why don't we uh, destroy them with facts and logic
Hahaha at this point even facts and logic is a rage inducing argument. "My facts" vs "Your facts"