LLMs Are Already Good Tutors: Training-Free Prompt Optimization for Pedagogical Math Tutoring

📅 2026-05-26
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the high cost and complexity of aligning large language models (LLMs) for mathematical tutoring, which typically requires reinforcement learning (RL) training on multi-GPU infrastructure. Instead, the authors propose a training-free, API-based paradigm that achieves pedagogically aligned tutoring through systematic prompt optimization. They introduce five education-specific prompt optimization methods and adapt seven published ones, giving a 12-method prompt evolution framework. The best configuration of each of the 12 methods surpasses the strongest RL-trained baseline (R_total = 0.633). Notably, the newly developed ParetoGrad method achieves the best Pareto balance across post-test solve rate, answer-leak control, and helpfulness, without dominating any single component. Behavioral analysis with an 82-code educational codebook shows that training-free methods rely on teaching-knowledge patterns at 2-3x the rate of RL-trained models.
πŸ“ Abstract
Aligning LLMs for math tutoring typically requires RL-based training with multi-GPU infrastructure. We investigate whether training-free prompt optimization (evolving only the system prompt via API calls) can serve as a practical alternative. We adapt 7 published methods and propose 5 education-specialized methods, evaluating these 12 methods under 5 conditions on 2 OOD benchmark suites. All 12 best-per-method configurations surpass the strongest RL-trained baseline (R_total = 0.633), and our ParetoGrad achieves the best Pareto balance across post-test solve rate, leak control, and helpfulness, rather than dominating any single component. Behavioral analysis with an 82-code educational codebook reveals that training-free methods rely on teaching-knowledge patterns at 2-3x the rate of RL-trained models, with a compensating ~10 percentage-point reduction in intent-level scaffolding. We also find a task-dependent reasoning mode effect consistent across training-free and RL-based paradigms. Our approach enables efficient development of pedagogically aligned LLM tutors with prompts alone and minimal compute.
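To make the notion of a Pareto balance across the three reward components concrete, here is a minimal sketch of Pareto-front selection over candidate system prompts. It is not the ParetoGrad algorithm, which the abstract does not specify. The candidate names, the score tuple layout, and the numbers are hypothetical and chosen only for illustration; all three scores are assumed to be "higher is better".

```python
from typing import Dict, List, Tuple

# Assumed layout: (solve_rate, leak_control, helpfulness), all higher-is-better.
Scores = Tuple[float, float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every component and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in candidates.items()
            if other != name
        )
    ]


# Made-up illustrative numbers, not results from the paper.
candidates = {
    "prompt_a": (0.70, 0.90, 0.60),
    "prompt_b": (0.65, 0.95, 0.65),
    "prompt_c": (0.60, 0.85, 0.55),  # worse than prompt_a on every component
}

print(pareto_front(candidates))  # ['prompt_a', 'prompt_b']
```

In this toy data neither prompt_a nor prompt_b dominates the other, so both stay on the front. This mirrors the abstract's point that a method can offer the best balance without leading on any single component.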
Problem

Research questions and friction points this paper is trying to address.

math tutoring
training-free
prompt optimization
pedagogical alignment
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

training-free prompting
prompt optimization
pedagogical alignment
math tutoring
ParetoGrad