Helped by GPT-5, Then Left to Their Own Devices: A Randomized Trial Measures the Learning Cost of AI Assistance

What remains of skills when the assistant disappears? A series of randomized controlled trials, published on arXiv in April, provides one of the first causal answers: training with an AI assistant reduces persistence and worsens independent performance, even on a task as basic as fraction calculation. The study is authored by Grace Liu (Carnegie Mellon), Brian Christian and Tsvetomira Dumbalska (Oxford), Michiel A. Bakker (MIT), and Rachit Dubey (UCLA). Christian is also the author of The Alignment Problem.

The protocol

The researchers recruited 1,222 participants in total across three experiments, randomly assigning them to conditions. In the main experiment, participants worked through 12 fraction problems, with or without a GPT-5-based assistant. All of them then took the same final test of 3 problems without assistance, with a “skip” button available at any time to abandon a problem. A replication study (667 participants) tightened the setup by adding a pretest, and a third experiment adapted the protocol to reading comprehension.
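For readers unfamiliar with the design, random assignment can be sketched in a few lines of Python. This is an illustration only: the article does not describe the study's actual randomization procedure, so the condition labels, the balanced two-arm split, and the `seed` parameter below are all assumptions.

```python
import random


def assign_conditions(participant_ids, conditions=("ai_assisted", "unassisted"), seed=0):
    """Randomly assign each participant to one condition (hypothetical sketch).

    Participants are shuffled, then dealt out to the conditions in turn,
    which keeps group sizes within one of each other.
    """
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {
        pid: conditions[i % len(conditions)]
        for i, pid in enumerate(shuffled)
    }


if __name__ == "__main__":
    assignment = assign_conditions(range(10))
    for pid in sorted(assignment):
        print(pid, assignment[pid])
```

Because chance alone decides who gets the assistant, differences between the groups on the final test can be attributed to the assistance rather than to who chose to use it.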

The results

The gap is clear. In the final no-AI test of the main experiment, the group that had previously used the assistant solved 57% of the problems, compared with 73% for the group that had trained alone; the abandonment rate nearly doubled, from 11% to 20%. The replication reproduced the effect, though attenuated (71% versus 77%), and the reading-comprehension task confirmed it (76% versus 89%, with eight times more abandonments). All this after only about ten minutes of exposure: the assistance did not merely shift competence onto the tool, it also weakened the willingness to exert effort itself. The authors interpret this as conditioning: AI trains users to expect immediate answers and deprives them of the experience of overcoming difficulty.
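To put the three results on a common scale, the reported percentages can be turned into percentage-point gaps and relative drops. The short Python sketch below uses only the figures quoted above; the function names and the choice of summary measures are ours, not the authors', and the calculation says nothing about statistical significance.

```python
# Solve rates on the final no-AI test, in percent, as quoted in the article:
# (group that trained with the assistant, group that trained alone).
SOLVE_RATES = {
    "main experiment": (57, 73),
    "replication": (71, 77),
    "reading comprehension": (76, 89),
}

# Abandonment ("skip") rates in the main experiment, in percent:
# (assisted group, unassisted group), from "11% to 20%".
SKIP_RATES_MAIN = (20, 11)


def gap_in_points(assisted, unassisted):
    """Absolute gap in percentage points."""
    return unassisted - assisted


def relative_drop(assisted, unassisted):
    """Gap expressed as a fraction of the unassisted group's rate."""
    return (unassisted - assisted) / unassisted


def summarize(rates):
    """Return both measures for every experiment in `rates`."""
    return {
        name: {
            "gap_points": gap_in_points(assisted, unassisted),
            "relative_drop": round(relative_drop(assisted, unassisted), 3),
        }
        for name, (assisted, unassisted) in rates.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(SOLVE_RATES).items():
        print(f"{name}: {stats['gap_points']} points, "
              f"{stats['relative_drop']:.1%} relative drop")
    assisted_skip, unassisted_skip = SKIP_RATES_MAIN
    print(f"skip-rate ratio: {assisted_skip / unassisted_skip:.2f}")
```

The gaps come out at 16, 6 and 13 percentage points respectively, and the skip-rate ratio in the main experiment is about 1.8, consistent with "nearly doubled".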

What the study proves — and what it does not

The strength of the result lies in the design: random assignment allows for a causal interpretation, whereas most work on AI-related “cognitive atrophy” relies on correlations or self-reports. Its limitations are equally clear. This is a preprint, not yet peer-reviewed. The tasks remain narrow — fractions and short-text reading — the time horizon is immediate, only one model was tested, and the effect is noticeably weaker in the stricter replication. The study establishes a mechanism, not a general law.

Why the result matters

Even so, the mechanism raises concerns far beyond the lab. In education, it documents the scenario teachers fear: a tool that improves immediate output while eroding the ability to perform without it. In business, where assistants are rolled out as implicit training tools, it suggests that assisted performance can be a misleading indicator of a team’s real competence. And for assistant designers, it argues for design choices that are still uncommon, such as deliberate friction and training modes in which the AI guides without solving. The question is no longer whether assistance helps (it does), but what it leaves behind when it is removed.

Stephane Nachez

ActuIA editorial team — news, data and analysis on artificial intelligence for decision-makers.
