🤖 AI Summary
Current large language models (LLMs) lack the pedagogical skills needed for one-on-one mathematics tutoring, such as withholding the answer and planning a multi-turn instructional dialogue.
Method: This paper introduces *Pedagogical Steering*, the problem of steering an LLM to use effective teaching strategies, and StratL, a prompt optimization algorithm that steers the LLM to follow a predefined multi-turn tutoring plan represented as a transition graph. As a case study, the authors build a prototype high school mathematics tutor that follows *Productive Failure* (PF), an established learning design.
Contribution/Results: In a field study with 17 high school students in Singapore, StratL succeeded in steering the LLM to follow the PF tutoring strategy. The authors highlight remaining challenges in Pedagogical Steering and publicly release a dataset of PF problems together with their code.
📝 Abstract
One-to-one tutoring is one of the most efficient methods of teaching. With the growing popularity of Large Language Models (LLMs), there have been efforts to create LLM-based conversational tutors that can extend the benefits of one-to-one tutoring to everyone. However, current LLMs are trained primarily to be helpful assistants and lack crucial pedagogical skills. For example, they often reveal the solution to the student too quickly and fail to plan for a richer multi-turn pedagogical interaction. To be used in pedagogical settings, LLMs need to be steered toward effective teaching strategies: a problem we introduce as Pedagogical Steering. We develop StratL, an algorithm that optimizes LLM prompts to steer the model to follow a predefined multi-turn tutoring plan represented as a transition graph. As a case study, we create a prototype tutor for high school math following Productive Failure (PF), an advanced and effective learning design. To validate our approach in a real-world setting, we run a field study with 17 high school students in Singapore and show that StratL succeeds in steering the LLM to follow the PF tutoring strategy. Finally, we highlight challenges in Pedagogical Steering of LLMs and offer opportunities for further improvements by publishing a dataset of PF problems and our code.
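To make the idea of a tutoring plan represented as a transition graph more concrete, the following is a minimal sketch in Python. It is not the StratL algorithm or the authors' released code. The state names, student event labels, instruction texts, and the `TutoringPlan` class are all hypothetical and chosen only for illustration. The sketch assumes that some other component has already classified the student's last message into an event label.

```python
from dataclasses import dataclass


@dataclass
class TutoringPlan:
    """Hypothetical tutoring plan: a small transition graph over tutoring steps."""

    # Maps each state to an instruction that is appended to the tutor's prompt.
    instructions: dict
    # Maps (current state, student event) to the next state.
    transitions: dict
    state: str = "explore"

    def step(self, student_event: str) -> str:
        """Follow a matching edge; stay in the current state if none exists."""
        self.state = self.transitions.get((self.state, student_event), self.state)
        return self.state

    def build_prompt(self, base_prompt: str) -> str:
        """Combine a fixed base prompt with the instruction for the current state."""
        return f"{base_prompt}\n\nCurrent tutoring step: {self.instructions[self.state]}"


def make_example_plan() -> TutoringPlan:
    # Assumed, simplified plan: the student attempts the problem before
    # the tutor explains a solution. All labels here are illustrative.
    return TutoringPlan(
        instructions={
            "explore": "Ask the student to propose a solution. Do not reveal the answer.",
            "compare": "Ask the student to compare their attempts.",
            "consolidate": "Explain a solution and link it to the student's attempts.",
        },
        transitions={
            ("explore", "proposed_solution"): "compare",
            ("compare", "explained_differences"): "consolidate",
        },
    )
```

In this sketch, a student who asks for the answer while in the `explore` state triggers no edge, so the plan stays in `explore` and the prompt keeps instructing the model to withhold the solution. Only an event that matches an edge moves the dialogue forward.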