🤖 AI Summary
This study addresses the limited cognitive and linguistic authenticity of student simulations that current large language models generate under zero- or few-shot prompting, which can direct teachers toward unrealistic student reasoning. It compares three approaches (fine-tuning, multi-agent, and Direct Preference Optimization, DPO) for building more authentic student simulators in elementary mathematics, complemented by qualitative interviews with pre-service teachers and researchers. All three approaches improve cognitive and linguistic authenticity over few-shot prompting. The multi-agent and DPO approaches make the reasoning behind student strategies explicit, whereas the fine-tuned model produces realistic but brief responses that limit opportunities to extend students' thinking. The findings point to trade-offs between authenticity and instructional utility, with implications for designing LLM-based student simulations for teacher learning.
📝 Abstract
Large Language Model (LLM) simulations, where LLMs act as students with varying approaches to learning tasks, can support teachers' noticing of student thinking. However, simulations using zero- or few-shot prompting often yield inauthentic knowledge and language, directing teachers to unrealistic reasoning. We evaluate three approaches (Fine-tuning, Multi-agent, and Direct Preference Optimization; DPO) to improve the authenticity and pedagogical utility of simulated students. All approaches improve cognitive and linguistic authenticity compared with few-shot prompts. Interviews with elementary mathematics pre-service teachers and researchers (n = 8) reveal distinct pedagogical affordances. The fine-tuned model produces realistic, brief responses but limits opportunities to extend students' thinking. Meanwhile, the multi-agent and DPO approaches generate explicit reasoning behind student strategies. We discuss implications for designing LLM simulations that balance authenticity with instructional utility for teacher learning.