Generative crash
Original post: social.coop (Mastodon)
i hope it does
I'm so happy to know the world will crumble from LLM corpo bullshit rather than climate change. It looks MUCH more fun.
Clammy Sam: We need to build even bigger LLMs to fix the errors of our current LLMs
We should crowdsource the answer. Submit this as a CAPTCHA to 1000 computer-users and return the golden mean as a consensus response.
Didn't an Excel error cause Greece to adopt austerity measures in response to their debt crisis?
That definitely happened in the UK.
i thought the EU practically forced them to adopt austerity?
I seem to remember the Germans saying no bailout if you don't fix your shit, but I can't remember what the demands were at the time.
A5: =COPILOT("THANKS!")
Nuclear power: intensifies.
Meanwhile, affable Hollywood personalities are like "huh?"
Best give that a thumbs down. Bad vibes!
A large language model shouldn't even attempt to do math, imo. They made an expensive hammer that is semi-good at one thing (parroting humans) and now they're treating every query like it's a nail.
Why isn't OpenAI working more modularly, whereby the LLM would call up specialized algorithms once it has identified the nature of the question? Or is it already modular and they just suck at anything that can't be calibrated purely with brute-force computing?
Why isn’t OpenAi working more modular whereby the LLM will call up specialized algorithms once it has identified the nature of the question?
Precisely because this is an LLM. It doesn't know the difference between writing out a maths problem, a recipe for cake, or a haiku. It transforms everything into the same domain and does fancy statistics to come up with a reply. It wouldn't know that it needs to invoke the "Calculator" feature unless you hard-code that in, which is what ChatGPT and Gemini do, but it's also easy to break.
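To make the "hard-coded and easy to break" point concrete, here's a minimal sketch of that kind of bolted-on dispatch; the regex, function name, and fallback message are all made up for illustration, not anything ChatGPT or Gemini actually does:

```python
import re

def dispatch(prompt: str) -> str:
    # Hard-coded router: if the prompt looks like bare arithmetic,
    # hand it to a real evaluator instead of letting the model guess.
    if re.fullmatch(r"[\d\s+\-*/().]+", prompt.strip()):
        # Deterministic path: actual arithmetic. (eval is fine for a
        # toy like this, never for untrusted input.)
        return str(eval(prompt))
    # "sum the numbers above" contains no digits or operators, so it
    # falls through to the model, which just guesses a likely token.
    return "fall back to the language model and hope"

print(dispatch("1 + 2 + 3"))              # deterministic: "6"
print(dispatch("sum the numbers above"))  # falls through to the model
```

The brittleness is exactly the failure in the screenshot: phrase the arithmetic in natural language and the deterministic path never triggers.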
Can't it be trained to do that?
In fairness the example I've seen MS give was taking a bunch of reviews and determining if the review was positive or negative from the text.
It was never meant to mangle numbers, but we all know it's going to be used for that anyway, because people still want to believe in a future where robots help them, rather than just take their jobs and advertise to them.
I would rather not have it attempt something it can't do; no result is better than a wrong result, imo. Here it correctly identifies that it's a calculation question, and instead of suggesting a formula, it tries to hallucinate a numerical answer itself. The creators of the model seem to have the mindset that the model must try to answer no matter what, instead of training it not to answer questions it can't answer correctly.
Why isn’t OpenAi working more modular whereby the LLM will call up specialized algorithms once it has identified the nature of the question?
Wasn't doing this part of the reason OpenSeek was able to compete with much smaller data sets and less hardware requirements?
I assume you mean DeepSeek? And it doesn't look like it, according to what I could find, their biggest innovation was "reinforcement learning to teach a base language model how to reason without any human supervision". https://huggingface.co/blog/open-r1
Some others have replied that chatgpt and copilot are already modular: they use python for arithmetic questions. But that apparently isn't enough to be useful.
They have (well, Anthropic)
It's called MCP
Nowadays you just give the AI access to a calculator, basically, or whatever other tools it needs, including other models to help it answer something.
I hadn't heard of that protocol before, thanks. It holds a lot of promise for the future, but getting the input right for the tools is probably quite the challenge. And it also scares me because I now expect more companies to release irresponsible integrations.
You have to admire the gall of naming your system after the evil AI villain of the original Tron movie.
Why isn’t OpenAi working more modular whereby the LLM will call up specialized algorithms once it has identified the nature of the question?
Because we vibes-coded the OpenAI model with OpenAI and it didn't think this was the optimal way to design itself.
the LLM will call up specialized algorithms once it has identified the nature of the question?
Because there is no "it" that "calls" or "identifies", let alone grasps the "nature of the question".
This requires intelligence, not probability on the most likely next token.
You can do maths with words and you can write poems with numbers; either requires actually understanding, and that's the linchpin, rather than parsing and providing an answer based on what has been written so far.
Sure, the model might string together tokens that sound very "appropriate" to the question, in the sense that they fit within the right vocabulary, and if the occurrence in its dataset was frequent enough it might even be correct, but that's still not understanding even one single word (or token) of either the question or the corpus.
An LLM can, by picking up on features of the input, predict a categorisation of the input and feed it to different processors. This already works; it doesn't require anything beyond the capabilities LLMs actually have, though it isn't perfect. It's a very good question why that hasn't happened here: an LLM can very reliably give you the answer to "how do you sum some data in python", so it only needs to do the same in Excel and put the result into the cell.
There are still plenty of pitfalls. This should not be one of them, so that's interesting.
Yep. Instead of focusing on humans communicating more effectively with computers, which are good at answering questions that have correct, knowable answers, we’ve invented a type of computer that can be wrong because maybe people will like the vibes more? (And we can sell vibes)
OpenAI already makes it write Python functions to do the calculations.
This is the right answer.
LLMs have already become this weird mesh of different services tied together to look more impressive. OpenAIs models can't do math and farm it out to python for accuracy.
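The "farm it out to Python" pattern, sketched very roughly: instead of asking the model for the *answer*, you ask it for *code* and execute that. The model output below is hard-coded for illustration; real systems generate it and sandbox execution far more carefully than this:

```python
# Pretend the LLM produced this expression instead of a numeric token.
model_output = "sum([1, 2, 3])"

# Executing model-written code gives a deterministic, checkable answer,
# which sampling "the most likely next token" does not.
result = eval(model_output, {"__builtins__": {}}, {"sum": sum})
print(result)  # 6
```

The accuracy comes entirely from the interpreter; the model's only job is producing a plausible expression, which is the one thing it's actually good at.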
So it's going to write Python functions to calculate the answer, where all the variables are stored in an Excel spreadsheet, a program that can already do the calculations? And how many forests did we burn down for that wonderful piece of MacGyvered software, I wonder.
The AI bubble cannot burst soon enough.
It equally fascinates and scares me how widely AI is already being adopted by companies, especially at this stage. I can understand playing around a little with AI, even if its energy requirements can pose an ethical dilemma, but actually implementing it in a workflow seems crazy to me.
Actually, I think the profit motive will correct the mistakes here.
If AI works in their workflow and uses less energy than before.. well, that's an improvement. If it uses more energy, they will revert back because it makes less economic sense.
This doesn't scare me at all. Most companies strive to stay as profitable as possible and if a 1+1 calculation costs a company more money by using AI to do it, they'll find a cheaper way.. like using a calculator like they have before.
We're just nearing the peak of the Gartner hype cycle, so it seems like everyone is doing it and it's being sold at a loss. This will correct.
I think the profit motive is what's driving it in many cases. I.e. shareholders have interest in AI companies/products and are pressuring the other companies they have interest in to use the AI services; increasing their profit. Profit itself is inefficiency (i.e. in a perfect market, profit approaches zero).
You put too much faith in people to make good decisions. This could decrease profits by a wide margin and they'd keep using it. Tbh, some would stick with the decision even if it threw them into the red.
You have more faith in people than I do.
I have managers that get angry if you tell them about problems with their ideas. So we have to implement their ideas despite the fact that they will cause the company to lose money in the long run.
Management isn't run by bean counters (if it were, it wouldn't be so bad); management is run by egos in suits. If they've staked their reputation on AI, they will dismiss any and all information that says their idea is stupid.
A problem right now is that most models are subsidized by investors. OpenAI's 2024 numbers were something like 2bn revenue vs 5bn expenses. All in the hopes of being the leader in a market that may not exist.
The problem is how long it takes to correct against stupid managers. Most companies aren't fully rational, it's just when you look at long term averages that the various stupidities usually cancel out (unless they bankrupt the company)
This doesn't scare me at all. Most companies strive to stay as profitable as possible and if a 1+1 calculation costs a company more money by using AI to do it, they'll find a cheaper way
This sounds like easy math, but the calculation isn't just more or less energy. AI's stated goal is to replace people, and people have a much, much more complicated cost formula, full of subjective measures. An AI doesn't need health insurance. An AI doesn't make public comments on social media that might reflect poorly on your company. An AI won't organize and demand a wage increase. An AI won't sue you for wrongful termination. An AI can be blamed for a problem, and the problem written off as "growing pains".
How long will the "potential" of the potential benefits encourage adopters to give it a little more time? How much damage will it do in the meantime?
I've had some fun trying to open old spreadsheet files. It's not been that painful. (Mostly because people I had to help never discovered macros. In optimal case they didn't even know about functions.) After all, you don't have weird external data sources. The spreadsheet is a frozen pile of data with strict rules.
I would love to be a fly on the wall when, in 10 years, someone needs to open an Excel file with Copilot stuff in it and needs fully reproducible results.
Wait, you're telling me it redoes all of the prompts every time you open the document? That's such a bad way of doing it, it's borderline criminal.
I did some digging. Apparently people are already raising eyebrows at the fact that it produces non-reproducible results, so much so that the official documentation says this:
The COPILOT function's model will evolve and improve in the future and formula results may change over time, even with the same arguments. Only use the COPILOT function where this is acceptable. If you don't want results to recalculate, consider converting them into values with Copy, Paste Values (select values, Ctrl + C, Ctrl + Shift + V).
...This is a note buried in the official help file. Gee gosh ding dilly damn, I hope the people who use this function read the docs thoroughly, just like they have done through the entire history of computing.
"Consider converting th-" [beer through her nose] CONSIDER???!? THIS IS A PRETTY CRUCIAL STEP.
They should apply the Balatro Misprint joker's visual effect to all of the cells affected by this function. To signal to the users that they've entered the Undecidability Zone. Where Gödel only went on Thursdays, and 8 to 10 Polynesian time on Tuesdays (subject to change).
they can't be THAT stupid, right? Right?
Imagine doing 30000 requests to a server in a different continent just to open the file and then close it immediately "oops, it's not what i was searching"
At the very least, why doesn't copilot just replace that prompt with the appropriate sum(A1:A3) command?
Then Microsoft can't motivate you to keep paying for the subscription
Looks like it's just producing random numbers. If you remove 'the' and only say 'sum numbers above', it'll result in 10. You can also do the same with no numbers anywhere in the sheet, and the results will still be 15 and 10.
You can even tell it to 'sum the numbers from A1, A2, A3' and it'll yet again produce 15. As per the documentation, you should give it specific cells as context, e.g., 'COPILOT("sum the numbers", A1:A3)'. I can confirm this works, though I'm not sure why, as the prompt sent to the model should be exactly the same.
Definitely more convenient than =SUM(A1:A3)
No kidding. If I have to tell the stupid model what I'm talking about explicitly with a cell range, I might as well write the function myself. And if I'm an excel novice who doesn't know SUM(), then I should go learn about it to solve my problem instead of offloading to a model whose result I can't verify.
LLMs are not a replacement for knowledge and skill. They're a tool that might, might be able to accelerate things you could already do.
That said... can copilot do conditional formatting? "Highlight in red all rows where cell 5 is No" would actually be useful. Conditional formatting rules are madness.
I just really looked at the image. =COPILOT() is absolutely insane... Does that make a call to Microsoft from this sheet every time it's opened? When the related cells change? When anything changes? If your document is 'financial forecast fy 2026', have you just given the whole doc as context to another company? Sure, sure, sure, they said they wouldn't look at your stuff, but prove it.
From 10 to 33 keystrokes, what an improvement.
Without specifying cells, it might be pulling numbers from its system prompt instead
Ok I see what happened here. You said the numbers "above" and it saw A in the column name. In hexadecimal that's a 10. But you also said "numbers" plural, and "1" isn't plural. So it took A + 2 + 3 = 15.
Makes perfect sense, maybe just write better prompts next time. /s
I doubt it. The column name isn't part of the data.
However "the numbers above" is data.
3 letters + 7 letters + 5 letters = 15 letters.
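For what it's worth, the letter-count theory checks out arithmetically (a joke verification, not a claim about how the model works):

```python
# Count the letters in each word of the prompt fragment.
words = "the numbers above".split()
print([len(w) for w in words])     # [3, 7, 5]
print(sum(len(w) for w in words))  # 15
```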
Doesn't even need the /s. That is largely how those glorified search engines work.
No it isn't. The above is using logic, however bad it may be. LLMs are not. It's just a statistical model. It doesn't think.
Woah woah woah, stop it right there. I won't stand for slander against actual search engines!
Thank you for that. I have the prompt discussion with a friend all the time.
Relax bro its just vibe working it doesnt have to be correct
Ed Zitron wrote an article a while ago about Business Idiots. From what I recall, the people in charge of these big companies are out of touch with users and the product, and so they make nonsense decisions. Companies aren't run by the best and brightest. They're run by people who do best in that environment.
Microsoft seems to be full of business idiots.
WheresYourEd.At/the-era-of-the-business-idiot
So funny
To be clear, I am deeply unconvinced that Nadella actually runs his life in this way, but if he does, Microsoft’s board should fire him immediately.
Ed was just on an Ars Technica livestream 👌 (not posted [yet])
Stunned that they're fucking with their flagship Office product. Without Excel, everyone could simply drop Office.
Been a sysadmin at small companies for 10 years, and that means I'm the one vetting and purchasing software. Last shop was all in on Google for Business and Google for auth. Worked pretty well, but accounting and HR still had to have Excel.
It's not even so much that other software can't do simple Excel tasks, it's the risk of your numbers getting lost in translation. In any case, nothing holds a candle to the power of Excel. And now they want to fuck with it?!
Excel is often used by people that don't know what a database is, and you end up with thousands of rows of denormalised data just waiting for typos or extra white spaces to fuck up the precarious stack of macros and formulae. Never mind the update/insert anomalies and data corruption waiting to strike.
I have a passionate hate for Excel, but I understand that not everyone is willing to learn more robust data processing.
It's probably because Access fucking sucks, leaving excel as the only database adjacent program available to office workers. I would love to be able to use anything but excel on my projects. Hell, python and some CSVs would make my life so much easier, but I ain't going through IT to let me have that, and it opens up a HUGE can of worms in my line of work if I start using homemade scripts. The execs can pay for a LIMS system if they want me more productive.
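The "python and some CSVs" workflow being wished for here really is this small; a sketch with a made-up inline CSV standing in for a real file, no LIMS required:

```python
import csv
import io

# Stand-in for a real CSV file on disk.
data = io.StringIO("value\n1\n2\n3\n")

# Deterministic, versionable, and it sums to 6 every single time.
total = sum(int(row["value"]) for row in csv.DictReader(data))
print(total)  # 6
```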
The precarious stack of macros and formulae that you also can't version control properly because it's a superficially-XML-ified memory dump, not textual source code.
Almost every nontrivial use of Excel would be better off as, if not a database, at least something like a Jupyter notebook with pandas.
Simple:
Whenever you see a bad, incorrect answer, always give the AI a shit ton of praise. If it gives a correct answer, chastise it to death.
Great. Now we're going to need therapists for AIs.
I've seen Gemini straight up trying to kill itself
Great, it's going to be bad for even longer
I don't see the problem. Sometimes it'll be fifteen, and then it will be perfect every time. This saves the user literal hours of time poring over documentation and agonizing over which esoteric function to use, which far outweighs the few times this number will be nine.
It's just a modernized version of XKCD 221
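In the spirit of XKCD 221's getRandomNumber, the modernized version might look like this (pure parody, not actual Copilot internals):

```python
def copilot_sum(prompt: str, *cells) -> int:
    # Chosen by gradient descent. Guaranteed to be plausible.
    return 15
```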
Or this
Dynamic typing issue?
1 + 2 = 12
12 + 3 = 15
1+2+3+AI
Like that LinkedIn lunatic, E = mc^2 + AI
Simple, 3+2 = 5. Add 1, which came before the 3 and 2, and you get 1 then 5, 15. It's new new math, or as I will henceforth call it, mAIth.
Tom Lehrer?
New-hoo-hoo math
It's so simple, so very simple
That only an AI can do it
Damn it now this is going to be stuck in my head for the day. "64. How did 64 get into it? I hear you cry."
not shown: row 5 - "January 25th"
Need more context. If this is an engineering calculation then it's wrong...but if it's just an upper level manager doing numeric gibberish, then it's probably no worse than their made-up input data anyway.
Has anyone actually tried to see if it's true? I don't have access to excel and copilot so I can't check.
One of the principal justifications for George Osborne’s 2010 austerity plans turned out to be erroneous thanks to an Excel error.
That was without any help from AI, things could be about to get much worse.
This will. Do not mess with Excel.
Make sure to thumbs up, like and subscribe!
The what?
Stop posting censored fucking memes you fucking twat.
Edit: I'm fucking sorry, but do you fucking people LIKE censorship?
Edit: I take it all back. I was drunk and the little lemmy symbol must have blurred onto the image. You are not a twat. Thanks for the concern about the carbon monoxide detector, but I was drunk.
Check your carbon monoxide detector is working. Seriously.
? There's nothing censored in this post, what are you seeing?
Silly copilot. Everyone knows the answer is 10 because A+1+2+3 = 10.
Nah A+1+2+3=65+49+50+51 so the answer is 215.
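That one also checks out, if you read every character as its ASCII code point:

```python
# 'A' is 65, '1' is 49, '2' is 50, '3' is 51.
print([ord(c) for c in "A123"])     # [65, 49, 50, 51]
print(sum(ord(c) for c in "A123"))  # 215
```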
F
Probably sales; if it's purchases, it shows -15
on the one hand, obviously this is not the way you should be using that function. On the other hand, I'm not sure how you even should.
I don't think Microsoft even knows how you should