Exploring LLMs for Predicting Tutor Strategy and Student Outcomes in Dialogues

2025-07-09
Citations: 0
Influential: 0
AI Summary
This study investigates the prediction of teacher instructional strategies in mathematics tutoring dialogues and their causal impact on student learning outcomes. Using two real-world mathematics tutoring dialogue datasets, we conduct the first systematic evaluation of Llama-3 and GPT-4o in modeling multi-turn strategy evolution, complemented by causal inference to assess the association between teacher strategies and subsequent student performance. Results reveal that current large language models exhibit significant limitations in long-horizon strategy trajectory prediction. However, teacher strategies themselves strongly predict downstream learning gains (AUC > 0.82), and this predictive power remains robust even after controlling for dialogue history. Our work establishes the critical insight that “strategy is a signal”, i.e., pedagogical actions serve as interpretable, intervention-relevant proxies for learning states. This finding provides both theoretical grounding and empirical support for developing explainable, causally aware intelligent tutoring systems.

Abstract
Tutoring dialogues have gained significant attention in recent years, given the prominence of online learning and the emerging tutoring abilities of artificial intelligence (AI) agents powered by large language models (LLMs). Recent studies have shown that the strategies used by tutors can have significant effects on student outcomes, necessitating methods to predict how tutors will behave and how their actions impact students. However, few works have studied predicting tutor strategy in dialogues. Therefore, in this work we investigate the ability of modern LLMs, particularly Llama 3 and GPT-4o, to predict both future tutor moves and student outcomes in dialogues, using two math tutoring dialogue datasets. We find that even state-of-the-art LLMs struggle to predict future tutor strategy while tutor strategy is highly indicative of student outcomes, outlining a need for more powerful methods to approach this task.
Problem

Research questions and friction points this paper is trying to address.

Predict tutor strategies in educational dialogues
Assess impact of tutor moves on student outcomes
Evaluate LLMs for strategy and outcome prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLMs to predict tutor strategies
Analyzing tutor-student dialogue datasets
Evaluating GPT-4o and Llama 3 performance
Fareya Ikram
University of Massachusetts Amherst
Alexander Scarlatos
PhD Student, University of Massachusetts - Amherst
Natural Language Processing, Education, Reinforcement Learning, Music Generation
Andrew Lan
University of Massachusetts Amherst