AI Summary
Large language models (LLMs) in intelligent tutoring systems often generate pedagogically misaligned feedback because they lack explicit adherence to instructional strategies. Method: We propose a fine-grained pedagogical intent modeling framework to improve response quality. Building on the MathDial dataset, we construct a novel annotation scheme covering 11 actionable, operationally defined teaching intents, the first such fine-grained, executable taxonomy. We design an automated intent annotation pipeline, train a pedagogically aligned model via supervised fine-tuning (SFT), and evaluate it with both automated metrics and human qualitative analysis. Contribution/Results: Our fine-grained model significantly improves the instructional appropriateness and scaffolding effectiveness of generated responses, outperforming four coarse-grained baselines across all evaluated dimensions. We publicly release the annotated dataset and source code, establishing a foundational resource for controllable generation and explainable AI in educational applications.
Abstract
Large language models (LLMs) hold great promise for educational applications, particularly intelligent tutoring systems. Effective tutoring, however, requires alignment with pedagogical strategies, which current LLMs lack without task-specific adaptation. In this work, we explore whether fine-grained annotation of teacher intents can improve the quality of LLM-generated tutoring responses. We focus on MathDial, a dialogue dataset for math instruction, and apply an automated annotation framework to re-annotate a portion of the dataset with a detailed taxonomy of eleven pedagogical intents. We then fine-tune an LLM on these new annotations and compare its performance to models trained on the original four-category taxonomy. Both automatic and qualitative evaluations show that the fine-grained model produces more pedagogically aligned and effective responses. Our findings highlight the value of intent specificity for controlled text generation in educational settings, and we release our annotated data and code to facilitate further research.
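To make the intent-conditioned fine-tuning setup concrete, the sketch below shows one plausible way to turn an intent-annotated tutoring turn into an SFT training example by prepending an explicit intent tag to the dialogue context. The intent labels, field names, and prompt format here are illustrative assumptions, not the paper's actual schema; the released dataset and code define the real taxonomy and formatting.

```python
# Hypothetical sketch of intent-conditioned SFT data construction.
# Intent names below are illustrative placeholders, NOT the paper's
# eleven-intent taxonomy; see the released data for the real labels.

FINE_GRAINED_INTENTS = [
    "ask_probing_question",
    "give_hint",
    "confirm_correct_step",
    "point_out_error",
]

def build_sft_example(dialog_history: list[str], intent: str,
                      teacher_reply: str) -> dict:
    """Condition generation on an explicit pedagogical intent by
    prepending an [intent: ...] tag to the dialogue context."""
    if intent not in FINE_GRAINED_INTENTS:
        raise ValueError(f"unknown intent: {intent}")
    prompt = "\n".join(dialog_history) + f"\n[intent: {intent}]\nTeacher:"
    return {"prompt": prompt, "completion": " " + teacher_reply}

example = build_sft_example(
    ["Student: I got 3x = 12, so x = 9."],
    "point_out_error",
    "Walk me through how you solved for x from 3x = 12.",
)
```

At inference time, the same tag would let the system request a response realizing a specific teaching move, which is the kind of controllability the fine-grained taxonomy is meant to enable.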