🤖 AI Summary
This study addresses the gap in formative feedback research for mathematical reasoning instruction in non-English, multilingual educational contexts. Method: We propose a novel LLM-to-LLM pedagogical simulation paradigm: strong LLMs act as “teachers” generating linguistically adapted hints, while weaker LLMs serve as “students” performing step-by-step reasoning, covering 11 languages, including seven low-resource ones. Through 352 controlled experiments, cross-lingual input–feedback pairing, and standardized evaluation, we assess instructional efficacy. Contribution/Results: Native-language-aligned feedback significantly enhances learning outcomes, yielding an average 19.3% accuracy gain in low-resource language settings. We further introduce the first analytical framework to jointly model linguistic properties, model capabilities, and prompting strategies, empirically confirming that language resource availability and model–task alignment are critical moderators of educational effectiveness.
📝 Abstract
Large language models (LLMs) have demonstrated the ability to generate formative feedback and instructional hints in English, making them increasingly relevant for AI-assisted education. However, their capacity to provide effective instructional support across languages, especially for mathematically grounded reasoning tasks, remains largely unexamined. In this work, we present the first large-scale simulation of multilingual tutor-student interactions using LLMs. A stronger model plays the role of the tutor, generating feedback in the form of hints, while a weaker model simulates the student. We explore 352 experimental settings across 11 typologically diverse languages, four state-of-the-art LLMs, and multiple prompting strategies to assess whether language-specific feedback leads to measurable learning gains. Our study examines how student input language, teacher feedback language, model choice, and language resource level jointly influence performance. Results show that multilingual hints can significantly improve learning outcomes, particularly in low-resource languages when feedback is aligned with the student's native language. These findings offer practical insights for developing multilingual, LLM-based educational tools that are both effective and inclusive.
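The tutor-student pairing described above can be pictured as a simple interaction loop: the student attempts a problem, and if the first attempt fails, the tutor supplies a hint in the chosen feedback language before the student retries. The sketch below is purely illustrative; all function names and the toy model stand-ins are hypothetical placeholders, not the paper's actual implementation.

```python
# Illustrative sketch of one tutor-student interaction round, assuming the
# tutor and student are callables wrapping a strong and a weak LLM.
# All names here are hypothetical, not from the paper's codebase.

def simulate_round(problem, answer, student, tutor, feedback_lang):
    """Attempt -> hint (in feedback_lang) -> re-attempt.

    Returns (solved, needed_hint): whether the final answer was correct,
    and whether a hint was required to get there.
    """
    first_try = student(problem, hint=None)
    if first_try == answer:
        return True, False                      # solved without help
    hint = tutor(problem, first_try, feedback_lang)  # feedback in target language
    second_try = student(problem, hint=hint)
    return second_try == answer, True           # solved only with the hint?

# Toy stand-ins so the sketch runs: a "student" that succeeds only with a hint,
# and a "tutor" that tags its hint with the feedback language code.
def toy_student(problem, hint=None):
    return "4" if hint else "5"

def toy_tutor(problem, wrong_answer, lang):
    return f"[{lang}] Check your addition again."

solved, needed_hint = simulate_round("2 + 2 = ?", "4",
                                     toy_student, toy_tutor, feedback_lang="sw")
```

In the study's setting, aggregating `solved` over many problems per (input language, feedback language, model) cell is what yields the accuracy comparisons reported above.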