10mo ago

Kids who use ChatGPT as a study assistant do worse on tests

hechingerreport.org Kids who use ChatGPT as a study assistant do worse on tests

Researchers compare math progress of almost 1,000 high school students

Does AI actually help students learn? A recent experiment in a high school provides a cautionary tale.

Researchers at the University of Pennsylvania found that Turkish high school students who had access to ChatGPT while doing practice math problems did worse on a math test compared with students who didn’t have access to ChatGPT. Those with ChatGPT solved 48 percent more of the practice problems correctly, but they ultimately scored 17 percent worse on a test of the topic that the students were learning.

A third group of students had access to a revised version of ChatGPT that functioned more like a tutor. This chatbot was programmed to provide hints without directly divulging the answer. The students who used it did spectacularly better on the practice problems, solving 127 percent more of them correctly compared with students who did their practice work without any high-tech aids. But on a test afterwards, these AI-tutored students did no better. Students who just did their practice problems the old fashioned way — on their own — matched their test scores.


Traditional instruction gave the same result as a bleeding edge ChatGPT tutorial bot. Imagine what would happen if a tiny fraction of the billions spent to develop this technology went into funding improved traditional instruction.
Better paid teachers, better resources, studies geared at optimizing traditional instruction, etc.
Move fast and break things was always a stupid goal. Turbocharging it with all this money is killing the tried and true options that actually produce results, while straining the power grid and worsening global warming.
- Investing in actual education infrastructure won't get VC techbros their yachts, though.
  
  It’s the other way round: Education makes for less gullible people and for workers that demand more rights more freely and easily - and then those are coming for their yachts…
- Imagine all the money spent on war would be invested into education 🫣what a beautiful world we would live in.
- And cracking open a book didn’t demolish the environment. Weird.
- Traditional instruction gave the same result as a bleeding edge ChatGPT tutorial bot.
  Interesting way of looking at it. I disagree with your conclusion about the study, though.
  It seems like the AI tool would be helpful for things like assignments rather than tests. I think it's intellectually dishonest to ignore the gains in some environments because it doesn't have gains in others.
  You're also comparing a young technology to methods that have been adapted over hundreds of thousands of years. Was the first automobile entirely superior to every horse?
  I get that some people just hate AI because it's AI. For the people interested in nuance, I think this study is interesting. I think other studies will seek to build on it.
  
  The point of assignments is to help study for your test.
  Homework is forced study. If you’re just handed the answers, you will do shit on the test.
- The education system is primarily about controlling bodies and minds. So any actual education is counter-productive.
- LLMs/GPT, and other forms of the AI boogeyman, are all just a tool we can use to augment education when it makes sense. Just like the introduction of calculators or the internet, AI isn't going to be the easy button, nor is it going to steal all teachers' jobs. These tools need to be studied, trained for, and applied purposely in order to be most effective.
  EDIT: Downvoters, I'd appreciate some engagement on why you disagree.
  
  are all just a tool
  just a tool
  it's just a tool
  a tool is a tool
  all are just tools
  it's no more than a tool
  it's just a tool
  it's a tool we can use
  one of our many tools
  it's only a tool
  these are just tools
  a tool for thee, a tool for me
  guns don't kill people, people kill people
  the solution is simple:
  teach drunk people not to shoot their guns so much
  unless they want to
  that is the American way
  tanks don't kill people, people kill people
  the solution is simple:
  teach drunk people not to shoot their tanks so much
  the barista who offered them soy milk
  wasn't implying anything about their T levels
  that is the American way
  Thanks for reminding me that AI is just tools, friend.
  My memory is not so good.
  I often can't
  remember

I don't even know if this is ChatGPT's fault. This would be the same outcome if someone just gave them the answers to a study packet. Yes, they'll have the answers because someone (or something) gave them to them, but they won't know how to get those answers without being taught. Surprise: For kids to learn, they need to be taught. Shocker.
- I've found chatGPT to be a great learning aid. You just don't use it to jump straight to the answers, you use it to explore the gaps and edges of what you know or understand. Add context and details, not final answers.
  
  The study shows that once you remove the LLM though, the benefit disappears. If you rely on an LLM to help break things down or add context and details, you don't learn those skills on your own.
  I used it to learn some coding, but without using it again, I couldn't replicate my own code. It's a struggle, but I don't think using it as a teaching aid is a good idea yet, maybe ever.

Kids who take shortcuts and don't learn suck at recalling knowledge they never had.
- The only reason we're trying to somehow compromise and allow or even incorporate cheating software into student education is because the tech-bros and singularity cultists have been hyping this technology like it's the new, unstoppable force of nature that is going to wash over all things and bring about the new Golden Age of humanity as none of us have to work ever again.
  Meanwhile, 80% of AI startups sink and something like 75% of the "new techs" like AI drive-thru orders and AI phone support go to call centers in India and Philippines. The only thing we seem to have gotten is the absolute rotting destruction of all content on the internet and children growing up thinking it's normal to consume this watered-down, plagiarized, worthless content.
- I took German in high school and cheated by inventing my own runic script. I would draw elaborate fantasy/sci-fi drawings on the covers of my notebooks with the German verb declensions and whatnot written all over monoliths or knight's armor or dueling spaceships, using my own script instead of regular characters, and then have these notebook sitting on my desk while taking the tests. I got 100% on every test and now the only German I can speak is the bullshit I remember Nightcrawler from the X-Men saying. Unglaublich!
  
  Meanwhile the teacher was thinking, "interesting tactic you've got there, admiring your art in the middle of a test"
  
  I just wrote really small on a paper in my glasses case, or hidden data in the depths of my TI86.
  We love Nightcrawler in this house.
- Good tl;dr
- Actually, if you read the article, ChatGPT is horrible at math. A modified version, where ChatGPT was fed the correct answers along with the problems, didn't make the kids stupider, but it didn't make them any better either, because they mostly just asked it for the answers.

At work we give a 16- or 17-year-old work experience over the summer. He was using ChatGPT and not understanding the code it was outputting.
In his last week he asked why he was writing a print statement, something like
print (f"message {thing} ")
- Sounds like operator error because he could have asked chatGPT and gotten the correct answer about python f strings...
  
  Students first need to learn to:
  Break down the line of code, then
  Ask the right questions
  The student in question probably didn't develop the mental faculties required to think, "Hmm... what the 'f'?"
  A similar thingy happened to me having to teach a BTech grad with 2 years of prior exp. At first, I found it hard to believe how someone couldn't ask such questions from themselves, by themselves. I am repeatedly dumbfounded at how someone manages to be so ignorant of something they are typing and recently realising (after interaction with multiple such people) that this is actually the norm^[and that I am the weirdo for trying hard and visualising the C++ abstract machine in my mind].
  
  It all depends on how and what you ask it, plus an element of randomness. Remember that it's essentially a massive text predictor. The same question asked in different ways can lead it into predicting text based on different conversations it trained on. There's a ton of people talking about python, some know it well, others not as well. And the LLM can end up giving some kind of hybrid of multiple other answers.
  It doesn't understand anything, it's just built a massive network of correlations such that if you type "Python", it will "want" to "talk" about scripting or snakes (just tried it, it preferred the scripting language, even when I said "snake", it asked me if I wanted help implementing the snake game in Python 😂).
  So it is very possible for it to give accurate responses sometimes and wildly different responses in other times. Like with the African countries that start with "K" question, I've seen reasonable responses and meme ones. It's even said there are none while also acknowledging Kenya in the same response.
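
The "text predictor plus randomness" idea above can be illustrated with a toy sketch. This is not how ChatGPT is implemented: it is a tiny bigram counter over a made-up corpus, and the names `corpus`, `following`, and `predict_next` are hypothetical, chosen only for this example. It shows how sampling from learned word frequencies can give different continuations for the same prompt.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model is trained on vastly more text.
corpus = (
    "python is a scripting language . "
    "python is a snake . "
    "python is a scripting language ."
).split()

# Count which word follows each word (a bigram table).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word, rng):
    """Sample the next word in proportion to how often it followed `word`."""
    candidates = following[word]
    return rng.choices(list(candidates), weights=list(candidates.values()))[0]

# Ask for the word after "a" many times: both continuations show up,
# and the one seen more often in the corpus is sampled more often.
rng = random.Random(0)
samples = Counter(predict_next("a", rng) for _ in range(1000))
print(samples)
```

In this toy corpus "scripting" follows "a" twice as often as "snake", so it dominates the samples without ever being the only answer.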
- I'm afraid to ask, but what's wrong with that line? In the right context that's fine to do, no?
  
  There is nothing wrong with it. He just didn't know what it meant after using it for a little over a month.
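
For anyone with the same question as the student, here is a minimal sketch of what the `f` prefix in that print line does. The name `thing` comes from the comment above; the value assigned to it is made up for illustration, since the original does not say what it held.

```python
# Hypothetical value; the original comment does not say what `thing` was.
thing = 42

# An f-string evaluates the expression inside {} and inserts its text form.
with_f = f"message {thing} "

# Roughly equivalent spellings without the f prefix.
with_format = "message {} ".format(thing)
with_concat = "message " + str(thing) + " "

# Without the f prefix, the braces are kept literally.
without_f = "message {thing} "

print(with_f)
print(without_f)
```

The first print shows the value substituted into the message; the second shows the braces left untouched, which is the difference the `f` makes.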

no shit
- "tests designed for use by people who don't use chatgpt is performed by people who don't"
  This is the same fn calculator argument we had 20 years ago.
  A tool is a tool. It will come in handy, but if it will be there in life, then it's a dumb test
  
  The point of learning isn't just access to that information later. That basic understanding gets built on all the way up through the end of your education, and is the base to all sorts of real world application.
  There's no overlap at all between people who can't pass a test without an LLM and people who understand the material.
  
  As someone who has taught math to students in a classroom, unless you have at least a basic understanding of HOW the numbers are supposed to work, the tool - a calculator - is useless. While getting the correct answer is important, I was more concerned with HOW you got that answer. Because if you know how you got that answer, then your ability to get the correct answer skyrockets.
  Because doing it your way leads to blindly relying on AI and believing those answers are always right. Because it's just a tool right?
  
  The main goal of learning is learning how to learn, or learning how to figure new things out. If "a tool can do it better, so there is no point in not allowing it" was the metric, we would be doing a disservice because no one would understand why things work the way they do, and thus be less equipped to further our knowledge.
  This is why I think common core, at least for math, is such a good thing because it teaches you methods that help you intuitively figure out how to get to the answer, rather than some mindless set of steps that gets you to the answer.

- IKR?
  Cheaters who cheat rather than learn don't learn. More on this shocking development at 11.
  
  Using ChatGPT as a study aid is cheating how?

Yea, this highlights a fundamental tension I think: sometimes, perhaps oftentimes, the point of doing something is the doing itself, not the result.
Tech is hyper focused on removing the "doing" and reproducing the result. Now that it's trying to put itself into the "thinking" part of human work, this tension is making itself unavoidable.
I think we can all take it as a given that we don't want to hand total control to machines, simply because of accountability issues. Which means we want a human "in the loop" to ensure things stay sensible. But the ability of that human to keep things sensible requires skills, experience and insight. And all of the focus our education system now has on grades and certificates has led us astray into thinking that the practice and experience doesn't mean that much. In a way the labour market and employers are relevant here in their insistence on experience (to the point of absurdity sometimes).
Bottom line is that we humans are doing machines, and we learn through practice and experience, in ways I suspect much closer to building intuitions. Being stuck on a problem, being confused and getting things wrong are all part of this experience. Making it easier to get the right answer is not making education better. LLMs likely have no good role to play in education and I wouldn't be surprised if banning them outright in what may become a harshly fought battle isn't too far away.
All that being said, I also think LLMs raise questions about what it is we're doing with our education and tests and whether the simple response to their existence is to conclude that anything an LLM can easily do well isn't worth assessing. Of course, as I've said above, that's likely manifestly rubbish ... building up an intelligent and capable human likely requires getting them to do things an LLM could easily do. But the question still stands I think about whether we need to also find a way to focus more on the less mechanical parts of human intelligence and education.
- LLMs likely have no good role to play in education and I wouldn't be surprised if banning them outright in what may become a harshly fought battle isn't too far away.
  While I agree that LLMs have no place in education, you're not going to be able to do more than just ban them in class unfortunately. Students will be able to use them at home, and the alleged "LLM detection" applications are no better than throwing a dart at the wall. You may catch a couple students, but you're going to falsely accuse many more. The only surefire way to catch them is them being stupid and not bothering to edit what they turn in.
  
  Yea I know, which is why I said it may become a harsh battle. Not being in education, it really seems like a difficult situation. My broader point about the harsh battle was that if it becomes well known that LLMs are bad for a child’s development, then there’ll be a good amount of anxiety from parents etc.

Kids using an AI system trained on edgelord Reddit posts aren’t doing well on tests?
Ya don’t say.

Like any tool, it depends how you use it. I have been learning a lot of math recently and have been chatting with AI to increase my understanding of the concepts. There are times when the textbook shows some steps that I don't understand why they're happening and I've questioned AI about it. Sometimes it takes a few tries of asking until you figure out the right question to ask to get the right answer you need, but that process of thinking helps you along the way anyways by crystallizing in your brain what exactly it is that you don't understand.
I have found it to be a very helpful tool in my educational path. However, I am learning things because I want to understand them, not because I have to pass a test, and that determination to understand makes a big difference. Just getting hints to help you solve the problem might not really help in the long run, but if you're actually curious about what you're learning and focus on getting a deeper understanding of why and how something works rather than just getting the right answer, it can be a very useful tool.
- Why are you so confident that the things you are learning from AI are correct? Are you just using it to gather other sources to review by hand or are you trying to have conversations with the AI?
  We've all seen AI get the correct answer but the show your work part is nonsense, or vice versa. How do you verify what AI outputs to you?
  
  You check its work. I used it to calculate efficiency in a factory game and went through and made corrections to inconsistencies I spotted. Always check its work.
  
  I use it for explaining stuff when studying for uni and I do it like this: If I don't understand e.g. a definition, I ask an LLM to explain it, read the original definition again and see if it makes sense.
  This is an informal approach, but if the definition is sufficiently complex, false answers are unlikely to lead to an understanding. Not impossible ofc, so always be wary.
  For context: I'm studying computer science, so lots of math and theoretical computer science.
  
  I'm not at all confident in the answers directly. I've gotten plenty of wrong answers from AI and I've gotten plenty of correct answers. If anything it's just more practice for critical thinking skills, separating what is true and what isn't.
  When it comes to math though, it's pretty straightforward, I'm just looking for context on some steps in the problems, maybe reminders of things I learned years ago and have forgotten, that sort of thing. As I said, I'm interested in actually understanding the stuff that I'm learning because I am using it for the things I'm working on so I'm mainly reading through textbooks and using AI as well as other sources online to round out my understanding of the concepts. If I'm getting the right answers and the things I am doing are working, it's a good indicator I'm on the right path.
  It's not like I'm doing cutting edge physics or medical research where mistakes could cost lives.
  
  I personally use its answers as a jumping off point to do my own research, or I ask it for sources directly about things and check those out. I frequently use LLMs for learning about topics, but definitely don't take anything they say at face value.
  For a personal example, I use ChatGPT as my personal Japanese tutor. I use it to discuss and break down nuances of various words or sayings, names of certain conjugation forms etc. etc., and it is absolutely not 100% correct, but I can now take the names of things that it gives me in native Japanese that I never would have known and look them up using other resources. Either it's correct and I find confirming information, or it's wrong and I can research further independently or ask it follow up questions. It's certainly not as good as a human native speaker, but for $20 a month and as someone who enjoys doing their own research, I fucking love it.
  
  I, like the OP, was also studying math from a textbook and using GPT4 to help clear things up. GPT4 caught an error in the textbook.
  The LLM doesn't have a theory of mind, it won't start over and try to explain a concept from a completely new angle, it mostly just repeats the same stuff over and over. Still, once I have figured something out, I can ask the LLM if my ideas are correct and it sometimes makes small corrections.
  Overall, most of my learning came from the textbook, and talking with the LLM about the concepts I had learned helped cement them in my brain. I didn't learn a whole lot from the LLM directly, but it was good enough to confirm what I learned from the textbook and sometimes correct mistakes.
  
  He is cross checking
  
  I mean, why are you confident the work in textbooks is correct? Both have been proven unreliable, though I will admit LLMs are much more so.
  The way you verify in this instance is actually going through the work yourself after you’ve been shown sources. They are explicitly not saying they take 1+1=3 as law, but instead asking how that was reached and working off that explanation to see if it makes sense and learn more.
  Math is likely the best for this too. You have undeniable truths in math, it’s true, or it’s false. There are no (meaningful) opinions on how addition works other than the correct one.
- Sometimes it leads me wildly astray when I do that, like a really bad tutor... but it is good if you want a refresher and can spot the bullshit on the side. It is good for surfacing things that you didn't know before and can fact-check afterwards.
  ...but maybe other review papers and textbooks are still better...

Of all the students in the world, they pick ones from a "Turkish high school". Any clear indication why there of all places when conducted by a US university?
- I'm guessing there was a previous connection with some of the study authors.
  I skimmed the paper, and I didn't see it mention language. I'd be more interested to know if they were using ChatGPT in English or Turkish, and how that would affect performance, since I assume the model is trained on significantly more English language data than Turkish.
  
  GPTs are designed with translation in mind, so I could see it being extremely useful in providing me instruction on a topic in a non-English native language.
  But they haven’t been around long enough for the novelty factor to wear off.
  It’s like computers in the 1980s… people played Oregon Trail on them, but they didn’t really help much with general education.
  Fast forward to today, and computers are the core of many facets of education, allowing students to learn knowledge and skills that they’d otherwise have no access to.
  GPTs will eventually go the same way.
- The paper only says it's a collaboration. It's pretty large scale, so the opportunity might be rare. There's a chance that (the same or other) researchers will follow up and experiment in more schools.
- The names of the authors suggest there could be a cultural link somewhere.
  
  Ah thanks, that does appear to be the case.
- If I had access to ChatGPT during my college years and it helped me parse things I didn't fully understand from the texts or provided much-needed context for what I was studying, I would've done much better having integrated my learning. That's one of the areas where ChatGPT shines. I only got there on my way out. But math problems? Ugh.
  
  When you automate these processes you lose the experience. I wouldn’t be surprised if you couldn’t parse information as well as you can now, if you had had access to ChatGPT.
  It’s hard to get better at solving your problems if something else does it for you.
  Also the reliability of these systems is poor, and they’re specifically trained to produce output that appears correct. Not actually is correct.
- The study was done in Turkey, probably because students are for sale and have no rights.
  It doesn't matter though. They could pick any weird, tiny sample and do another meaningless study. It would still get hyped and they would still get funding.

TLDR: ChatGPT is terrible at math and most students just ask it the answer. Giving students the ability to ask something that doesn't know math the answer makes them less capable. An enhanced chatBOT which was pre-fed with questions and correct answers didn't screw up the learning process in the same fashion but also didn't help them perform any better on the test because again they just asked it to spoon feed them the answer.

- Haven’t seen that in ages. Thanks.
  
  No worries. Somehow I use it quite often nowadays
- God I miss ytmnd https://owleyes.ytmnd.com/

I'm not entirely sold on the argument I lay out here, but this is where I would start were I to defend using chatGPT in school as they laid out in their experiment.
It's a tool. Just like a calculator. If a kid learns and does all their homework with a calculator, then suddenly it's taken away for a test, of course they will do poorly. Contrary to what we were warned about as kids though, each of us does carry a calculator around in our pocket at nearly all times.
We're not far off from it being feasible to have an AI assistant with us 24/7. Why not teach kids to use the tools they will have in their pocket for the rest of their lives?
- I think here you also need to teach your kid not to trust this tool unconditionally and to question its quality, as well as teaching them how to write better prompts. This is the same as with Google: if you put in shitty queries, you will get subpar results.
  And believe me I have seen plenty of tech people asking the most lame prompts.
  
  I remember teachers telling us not to trust the calculators. What if we hit the wrong key? Lol
  Some things never change.
- As adults we are dubious of the results that AI gives us. We take the answers with a handful of salt and I feel like over the years we have built up a skillset for using search engines for answers and sifting through the results. Kids haven't got years of experience of this and so they may take what is said to be true and not question the results.
  As you say, the kids should be taught to use the tool properly, and verify the answers. AI is going to be forced onto us whether we like it or not, people should be empowered to use it and not accept what it puts out as gospel.
  
  This is true for the whole internet, not only AI chatbots. Kids need to be taught that there is BS around. In fact kids had to learn that even pre-internet. Every human has to learn that you cannot blindly trust anything, that one has to think critically. This is nothing new. AI chatbots just show how flawed human education is these days.
- It’s a tool. Just like a calculator.
  lol my calculator never "hallucinated".
  
  Yeah it's like if you had a calculator and 10% of the time it gave you the wrong answer. Would that be a good tool for learning? We should be careful when using these tools and understand their limitations. Gen AI may give you an answer that happens to be correct some of the time (maybe even most of the time!) but they do not have the ability to actually reason. This is why they give back answers that we understand intuitively are incorrect (like putting glue on pizza), but sometimes the mistakes can be less intuitive or subtle which is worse in my opinion.
  
  Ask your calculator what 1-(1-1e-99) is and see if it still never hallucinates (confidently gives an incorrect answer).
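
The calculator point can be checked with a short sketch. It assumes the calculator uses fixed-precision floating point; Python's `float` (an IEEE 754 double) stands in for it here, and a physical calculator may carry a different number of digits. Exact rational arithmetic via `fractions.Fraction` is shown for comparison.

```python
from fractions import Fraction

# In double precision, 1e-99 is far smaller than the gap between
# neighbouring floats near 1.0 (about 1.1e-16), so 1 - 1e-99 rounds
# back to exactly 1.0 and the tiny term is lost.
float_result = 1 - (1 - 1e-99)

# Exact rational arithmetic keeps the tiny term.
tiny = Fraction(1, 10**99)
exact_result = 1 - (1 - tiny)

print(float_result)          # 0.0
print(exact_result == tiny)  # True
```

The floating-point answer is 0 while the true answer is 1e-99, so the result is reported confidently and is wrong, though for a predictable, documented reason rather than a random one.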

Something I've noticed with institutional education is that they're not looking for the factually correct answer, they're looking for the answer that matches whatever you were told in class. Those two things should not be different, but in my experience, they're not always the same thing.
I have no idea if this is a factor here, but it's something I've noticed. I have actually answered questions with a factually wrong answer, because that's what was taught, just to get the marks.

Taking too many shortcuts doesn't help anyone learn anything.

This isn't a new issue. Wolfram alpha has been around for 15 years and can easily handle high school level math problems.
- Except wolfram alpha is able to correctly explain step by step solutions. Which was an aid in my education.

ChatGPT lies which is kind of an issue in education.
As far as seeing the answer, I learned a significant amount of math by looking at the answer for a type of question and working backwards. That's not the issue as long as you're honestly trying to understand the process.

I've found AI helpful in asking for it to explain stuff. Why is the problem solved like this, why did you use this and not that, could you put it in simpler terms and so on. Much like you might ask a teacher.
- I think this works great if the student is interested in the subject, but if you're just trying to work through a bunch of problems so you can stop working through a bunch of problems, it ain't gonna help you.
  I have personally learned so much from LLMs (although you can't really take anything at face value and have to look things up independently, but it gives you a great starting place), but it comes from a genuine interest in the questions I'm asking and things I dig at.
  
  I have personally learned so much from LLMs
  No offense, but that's what the article is also highlighting, namely that students, even the good ones, believe they did learn. Once it's time to pass a test designed to evaluate whether they actually did, it's not that positive.
- To an extent, but it's often just wrong about stuff.
  It's been a good second step for things I have questions about that I can't immediately find good search results for. I don't wanna get off topic but I have major beef with Stack Overflow and posting questions there makes me anxious as hell because I'll do so much diligence to make sure it is clear, reproducible, and not a duplicate only for my questions to still get closed. It's a major fucking waste of my time. Why put all that effort in when it's still going to get closed?? Anyways -- ChatGPT never gets mad at me. Sure, it's often wrong as hell but it never berates me or makes me feel stupid for asking a question. It generally gets me close enough on topics that I can search for other terms in search engines and get different results that are more helpful.
- Yep. My first interaction with GPT pro lasted 36 hours and I nearly changed my religion.
  AI is the best thing to come to learning, ever. If you are a curious person, this is bigger than Gutenberg, IMO.
  
  That sounds like a manic episode

Maybe, if the system taught more of HOW to think and not WHAT. Basically more critical thinking/deduction.
This same kinda topic came up back when I was in middle/high school when search engines became widespread.
However, LLMs shouldn't be trusted for factual anything, same as Joe Blow's blog on some random subject. Did they forget to teach cross referencing too? I'm sounding too bitter and old so I'll stop.
- However, LLMs shouldn’t be trusted for factual anything, same as Joe Blow’s blog on some random subject.
  Podcasts are 100% reliable tho

What do the results of the third group suggest? AI doesn't appear to have hindered their ability to manage by themselves under test conditions, but it did help them significantly with their practice results. You could argue the positive reinforcement an AI tutor can provide during test preparations might help some students with their confidence and pre-exam nerves, which will allow them to perform closer to their best under exam conditions.
- It suggests that the best the chatbot can do, after being carefully tailored for its job, is no better than the old methods (because the goal is for the students to be able to handle the subject matter without having to check every common operation with a third party, regardless of whether that's a chatbot or a textbook, and the test is the best indicator of that). Therefore, spending the electricity to run an educational chatbot for highschoolers isn't justified at this time, but it's probably worth rechecking in a few years to see if its results have improved. It may also be worth doing extended testing to determine whether there are specific subsets of the student body that benefit more from the chatbot than others. And allowing the students to seek out an untailored chatbot on their own is strongly counterindicated.
  
  I would like to see it compared with human tutors too. This could be a more affordable alternative for students who need help outside of the classroom but can't afford paid tutoring.
- Yess
- Yep. But the post title suggests that all students who used ChatGPT did worse. Fuck this clickbait shit.

See also: competitive cognitive artifacts. https://philosophicaldisquisitions.blogspot.com/2016/09/competitive-cognitive-artifacts-and.html
These are artifacts that amplify and improve our abilities to perform cognitive tasks when we have use of the artifact but when we take away the artifact we are no better (and possibly worse) at performing the cognitive task than we were before.

Did those using tutor AI spend less time on learning? That would have been worth measuring
- Interesting thought, I would be curious about this too.

Youdontsay.png

Perhaps unsurprisingly. Any sort of "assistance" with answers will do that.
Students have to learn why things work the way they do, and they won't be able to grasp it without going ahead and doing every piece manually.

No shit

While I get that, AI could be handy for some subjects that you won't stake your future on. However, using it extensively for everything is overdoing it.

Shocked, I tell you!

Unsurprised
I would have no problem with AI if the shit actually worked
- No, I think the point here is that the kids never learned the material, not that AI taught them the wrong material (though there is a high possibility of that).
  
  Yes, yet there is indeed a deeper point. If the AI is to be used as a teaching tool, it still has to give genuinely useful advice, not good-sounding advice that might actually be wrong. LLMs can feed wrong final answers, but they can also make poor suggestions on the process itself. So both are problematic: how the tool is used and its intrinsic limitations.

Because AI and, previously, Google searches are not a substitute for having knowledge and experience. You can learn by googling something and reading about how it works so you can figure out answers for yourself. But googling for answers will not teach you much. Even if it solves a problem, you won't learn how, and you won't be able to fix something in the future without googling the answer again.
If you don't learn how to do something, you won't be experienced enough to know when you are doing it wrong.
I use Google to give me answers all the time when I'm problem solving. But I have to spend a lot more time after the fact to learn why what I did fixed the problem.

Yeh because it's just like having their dumb parents do homework for them

Would kids do better if the AI doesn't hallucinate?
- Would snails be happier if it kept raining? What can we do to make it rain forever and all time?
- Paradoxically, they would probably do better if the AI hallucinated more. When you realize your tutor is capable of making mistakes, you can't just blindly follow their process; you have to analyze and verify their work, which forces a more complete understanding of the concept, and some insight into what errors can occur and how they might affect outcomes.

Kids who use ChatGPT as a study assistant do worse on tests
But on a test afterwards, these AI-tutored students did no better. Students who just did their practice problems the old fashioned way — on their own — matched their test scores
Headline: People who flip coins have a much worse chance of calling it if they call heads!
Text: Studies show that people who call heads when flipping coins have an even chance of getting it right compared to people who do it the old-fashioned way of calling tails.
- You skipped the paragraph where they used two different versions of LLMs in the study. The first statement is regarding generic ChatGPT. The second statement is regarding an LLM designed to be a tutor without directly giving answers.
  
  I didn't skip it. If you are going to use a tool, use it right. "Study shows using the larger plastic end of screwdriver makes it harder to turn screws than just using fingers to twist them. Researchers caution against using screwdriver to turn screws."
- That's a modified version; it says using unmodified ChatGPT results in 17% worse scores.

The title is pretty misleading. Kids who used ChatGPT to get hints/explanations rather than outright getting the answers did as well as those who had no access to ChatGPT. They probably had a much easier time studying/understanding with it so it's a win for LLMs as a teaching tool imo.
- Is it really a win for LLMs if the study found no significant difference between those using it as a tutor and those not?
  
  As another poster asked, if it saved them time then, yes, it is absolutely a win. But if they spent the same amount of time, I would agree with you that it's not a win.
  
  Maybe using LLM assistance was less stressful or quicker than self-study. The tutoring-focused LLM is definitely better than allowing full access to GPT itself, which is what is currently happening.
  
  Not everyone can afford a tutor or knows where to find an expert that can answer questions in any given domain. I think such a tool would have made understanding a lot of my college courses a lot easier.
