AI tutoring can safely and effectively support students: An exploratory RCT in UK classrooms

📅 2025-12-29
🤖 AI Summary
This study addresses the scalability of personalized education, in particular the high cost of one-to-one tutoring, through an exploratory randomized controlled trial with 165 students across five UK secondary schools. The trial evaluated LearnLM, a generative AI model fine-tuned for pedagogy, integrated into chat-based mathematics tutoring sessions on the Eedi platform. Expert human tutors supervised the model, revising each message it drafted until they would be satisfied sending it themselves. Tutors approved 76.4% of LearnLM's drafted messages with zero or minimal edits (changes of only one or two characters). Students supported by LearnLM performed at least as well as students tutored by humans alone on every learning outcome measured, and they were 5.5 percentage points more likely to solve novel problems on subsequent topics (66.2% vs. 60.7%). In interviews, tutors highlighted LearnLM's Socratic questioning, and several reported learning new pedagogical practices from the model.

📝 Abstract
One-to-one tutoring is widely considered the gold standard for personalized education, yet it remains prohibitively expensive to scale. To evaluate whether generative AI might help expand access to this resource, we conducted an exploratory randomized controlled trial (RCT) with $N = 165$ students across five UK secondary schools. We integrated LearnLM -- a generative AI model fine-tuned for pedagogy -- into chat-based tutoring sessions on the Eedi mathematics platform. In the RCT, expert tutors directly supervised LearnLM, with the remit to revise each message it drafted until they would be satisfied sending it themselves. LearnLM proved to be a reliable source of pedagogical instruction, with supervising tutors approving 76.4% of its drafted messages making zero or minimal edits (i.e., changing only one or two characters). This translated into effective tutoring support: students guided by LearnLM performed at least as well as students chatting with human tutors on each learning outcome we measured. In fact, students who received support from LearnLM were 5.5 percentage points more likely to solve novel problems on subsequent topics (with a success rate of 66.2%) than those who received tutoring from human tutors alone (rate of 60.7%). In interviews, tutors highlighted LearnLM's strength at drafting Socratic questions that encouraged deeper reflection from students, with multiple tutors even reporting that they learned new pedagogical practices from the model. Overall, our results suggest that pedagogically fine-tuned AI tutoring systems may play a promising role in delivering effective, individualized learning support at scale.
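
The abstract's "zero or minimal edits" criterion (changing only one or two characters) can be illustrated with a small sketch. This is not the authors' analysis code: it assumes, purely for illustration, that "characters changed" is measured as the Levenshtein edit distance between the drafted message and the message actually sent. The function names, the threshold constant, and the sample messages are all hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# Assumption: "one or two characters" is read as edit distance <= 2.
MINIMAL_EDIT_MAX = 2


def is_minimal_edit(draft: str, sent: str, max_chars: int = MINIMAL_EDIT_MAX) -> bool:
    """True if the tutor sent the draft unchanged or changed at most max_chars characters."""
    return levenshtein(draft, sent) <= max_chars


def approval_rate(pairs: list[tuple[str, str]]) -> float:
    """Percentage of (draft, sent) pairs that count as zero or minimal edits."""
    if not pairs:
        raise ValueError("need at least one (draft, sent) pair")
    approved = sum(is_minimal_edit(draft, sent) for draft, sent in pairs)
    return 100.0 * approved / len(pairs)


# Hypothetical (draft, sent) message pairs, invented for this example.
pairs = [
    ("What do you notice about the two fractions?",
     "What do you notice about the two fractions?"),   # unchanged
    ("Can you explain you reasoning?",
     "Can you explain your reasoning?"),               # one character inserted
    ("The answer is 12.",
     "What would happen if you doubled both sides?"),  # full rewrite
    ("Try again", "Try again!"),                       # one character inserted
]

print(approval_rate(pairs))  # 75.0
```

Under this reading, the reported 76.4% would be the share of drafted messages whose sent version lies within two character edits of the draft. The paper may define the measure differently.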
Problem

Research questions and friction points this paper is trying to address.

Evaluates AI tutoring's effectiveness in personalized education
Compares AI vs human tutoring on student learning outcomes
Explores scalable AI solutions for individualized academic support
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI tutoring uses a pedagogically fine-tuned generative model
Human experts supervise and edit AI-generated tutoring messages
AI system matches human tutor effectiveness in student outcomes
Authors

LearnLM Team
Google & Eedi

Albert Wang
Google & Eedi

Aliya Rysbek
Google & Eedi

Andrea Huber
Google & Eedi

Anjali Nambiar
Google & Eedi

Anna Kenolty
Google & Eedi

Ben Caulfield
Google & Eedi

Beth Lilley-Draper
Google & Eedi

Bibi Groot
Google & Eedi

Brian Veprek
Google & Eedi

Chelsea Burdett
Google & Eedi

Claire Willis
Google & Eedi

Craig Barton
Google & Eedi

Digory Smith
Google & Eedi

George Mu
Google & Eedi

Harriet Walters
Google & Eedi

Irina Jurenka
DeepMind
Artificial Intelligence, Neuroscience, Unsupervised Learning, Generative Models, Representation

Iris Hulls
Google & Eedi

James Stalley-Moores
Google & Eedi

Jonathan Caton
Google & Eedi

Julia Wilkowski
Google & Eedi

Kaiz Alarakyia
Google & Eedi

Kevin R. McKee
Staff Research Scientist, Google DeepMind
Cooperative AI, Human data, Social cognition, Participatory AI, Reinforcement learning