Exploring LLMs for Predicting Tutor Strategy and Student Outcomes in Dialogues

📅 2025-07-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates how well large language models can predict tutor instructional strategies in mathematics tutoring dialogues, and how those strategies relate to student learning outcomes. Using two real-world mathematics tutoring dialogue datasets, we evaluate Llama 3 and GPT-4o on modeling how tutor strategy evolves across multiple turns, complemented by an analysis of the association between tutor strategies and subsequent student performance. Results reveal that current large language models struggle to predict tutor strategy over long horizons. However, tutor strategies themselves strongly predict downstream student outcomes (AUC > 0.82), and this predictive power remains robust even after controlling for dialogue history. The central insight is that “strategy is a signal”: pedagogical actions serve as interpretable, intervention-relevant proxies for learning states. This finding offers empirical support for developing explainable intelligent tutoring systems.
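To make the AUC figure above concrete, the following is a minimal sketch of how AUC can be computed for a predictor that scores a dialogue from its tutor moves. It is not the paper's method: the move labels (`probing`, `hint`, `telling`), the `strategy_score` heuristic, and the toy dialogues are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive example receives a
    higher score than a random negative example (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def strategy_score(moves):
    """Hypothetical heuristic: fraction of tutor turns that are not 'telling'."""
    return sum(m != "telling" for m in moves) / len(moves)


# Hypothetical toy data: (tutor move sequence, student outcome), where
# outcome 1 means the student succeeded and 0 means they did not.
dialogues = [
    (["probing", "hint", "probing"], 1),
    (["telling", "telling"], 0),
    (["hint", "probing"], 1),
    (["telling", "hint"], 0),
    (["probing", "hint"], 0),
    (["hint", "hint", "probing"], 1),
]

labels = [outcome for _, outcome in dialogues]
scores = [strategy_score(moves) for moves, _ in dialogues]
auc = roc_auc(labels, scores)
print(f"AUC on toy data: {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of successful from unsuccessful dialogues, so a reported value above 0.82 indicates that tutor strategy carries substantial information about student outcomes.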

📝 Abstract

Tutoring dialogues have gained significant attention in recent years, given the prominence of online learning and the emerging tutoring abilities of artificial intelligence (AI) agents powered by large language models (LLMs). Recent studies have shown that the strategies used by tutors can have significant effects on student outcomes, necessitating methods to predict how tutors will behave and how their actions impact students. However, few works have studied predicting tutor strategy in dialogues. Therefore, in this work we investigate the ability of modern LLMs, particularly Llama 3 and GPT-4o, to predict both future tutor moves and student outcomes in dialogues, using two math tutoring dialogue datasets. We find that even state-of-the-art LLMs struggle to predict future tutor strategy while tutor strategy is highly indicative of student outcomes, outlining a need for more powerful methods to approach this task.

Problem

Research questions and friction points this paper is trying to address.

Predict tutor strategies in educational dialogues

Assess impact of tutor moves on student outcomes

Evaluate LLMs for strategy and outcome prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLMs to predict tutor strategies

Analyzing tutor-student dialogue datasets

Evaluating GPT-4o and Llama 3 performance

🔎 Similar Papers

Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure