TeachLM: Post-Training LLMs for Education Using Authentic Learning Data

📅 2025-10-06

🤖 AI Summary

Current large language models (LLMs) do not explicitly model how real students learn, which limits their capacity for high-quality personalized instruction. To address this, we apply parameter-efficient fine-tuning to state-of-the-art models using real one-on-one student-tutor dialogue data, yielding an education-specific LLM with pedagogical awareness. The approach has two parts. First, a student model trained on anonymized, large-scale tutoring interactions generates high-fidelity synthetic student-tutor dialogues. Second, an automated multi-turn evaluation protocol built on those synthetic dialogues assesses the dialogical capabilities of LLMs. Fine-tuning on authentic data addresses a limitation of prompt engineering, which cannot fully encode complex instructional strategies. Evaluations show substantial improvements in teaching interaction quality: student talk time doubles, questioning style improves, dialogue turn count rises by 50%, and instruction becomes more personalized.

📝 Abstract

The promise of generative AI to revolutionize education is constrained by the pedagogical limits of large language models (LLMs). A major issue is the lack of access to high-quality training data that reflect the learning of actual students. Prompt engineering has emerged as a stopgap, but the ability of prompts to encode complex pedagogical strategies in rule-based natural language is inherently limited. To address this gap, we introduce TeachLM, an LLM optimized for teaching through parameter-efficient fine-tuning of state-of-the-art models. TeachLM is trained on a dataset comprising 100,000 hours of one-on-one, longitudinal student-tutor interactions maintained by Polygence, which underwent a rigorous anonymization process to protect privacy. We use parameter-efficient fine-tuning to develop an authentic student model that enables the generation of high-fidelity synthetic student-tutor dialogues. Building on this capability, we propose a novel multi-turn evaluation protocol that leverages synthetic dialogue generation to provide fast, scalable, and reproducible assessments of the dialogical capabilities of LLMs. Our evaluations demonstrate that fine-tuning on authentic learning data significantly improves conversational and pedagogical performance: doubling student talk time, improving questioning style, increasing dialogue turns by 50%, and increasing the personalization of instruction.
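
The conversational metrics named in the abstract (student talk time, dialogue turns) can be illustrated with a minimal sketch. It assumes a dialogue is stored as a list of (speaker, utterance) pairs and approximates talk time by word count. Both the data format and the word-count proxy are assumptions made for illustration, not the paper's actual measurement procedure, and the function names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical dialogue format: a list of (speaker, utterance) pairs.
Dialogue = List[Tuple[str, str]]


def student_talk_share(dialogue: Dialogue) -> float:
    """Fraction of all words spoken by the student (a rough proxy for talk time)."""
    student_words = sum(len(text.split()) for who, text in dialogue if who == "student")
    total_words = sum(len(text.split()) for _, text in dialogue)
    return student_words / total_words if total_words else 0.0


def turn_count(dialogue: Dialogue) -> int:
    """Count turns, merging consecutive utterances by the same speaker into one turn."""
    turns = 0
    previous = None
    for who, _ in dialogue:
        if who != previous:
            turns += 1
            previous = who
    return turns


example: Dialogue = [
    ("tutor", "What do you already know about photosynthesis?"),
    ("student", "Plants use sunlight to make food, I think."),
    ("student", "And they need water."),
    ("tutor", "Good. Where does the carbon come from?"),
]

print(f"student talk share: {student_talk_share(example):.2f}")
print(f"turns: {turn_count(example)}")
```

Comparing these two numbers for a baseline model and a fine-tuned model, over the same set of simulated conversations, is one simple way to check claims such as "doubling student talk time" or "increasing dialogue turns by 50%".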

Problem

Research questions and friction points this paper is trying to address.

Addresses limited pedagogical capabilities in educational LLMs

Overcomes scarcity of authentic student learning data

Enhances dialogical teaching performance through specialized fine-tuning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-efficient fine-tuning of state-of-the-art LLMs

Training on anonymized longitudinal student-tutor interactions

Generating synthetic dialogues for a multi-turn evaluation protocol
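
The idea of pairing a tutor model with a student model to generate synthetic multi-turn dialogues can be sketched as a simple alternation loop. This is only an illustration under stated assumptions: the summary does not give the protocol's details, so the `Model` callable interface, the stopping rule (an empty reply), and the stub models below are all hypothetical stand-ins for fine-tuned LLMs.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]               # (speaker, utterance)
Model = Callable[[List[Turn]], str]  # hypothetical interface: history -> next utterance


def simulate_dialogue(tutor: Model, student: Model, opening: str, max_turns: int) -> List[Turn]:
    """Alternate the student and tutor models, starting from a student opening message."""
    history: List[Turn] = [("student", opening)]
    while len(history) < max_turns:
        # Whoever did not speak last takes the next turn.
        if history[-1][0] == "student":
            speaker, model = "tutor", tutor
        else:
            speaker, model = "student", student
        reply = model(history)
        if not reply:  # assumption: an empty reply ends the conversation early
            break
        history.append((speaker, reply))
    return history


# Stub models for illustration only; real use would call an LLM here.
def stub_tutor(history: List[Turn]) -> str:
    return "What have you tried so far?"


def stub_student(history: List[Turn]) -> str:
    student_turns = sum(1 for who, _ in history if who == "student")
    return "" if student_turns >= 3 else "I tried factoring it."


dialogue = simulate_dialogue(stub_tutor, stub_student, "I am stuck on this equation.", max_turns=10)
for who, text in dialogue:
    print(f"{who}: {text}")
```

Because the student side is simulated, the same loop can be rerun against different tutor models with identical settings, which is what makes this kind of evaluation fast and reproducible.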
