Progressive Mastery: Customized Curriculum Learning with Guided Prompting for Mathematical Reasoning

📅 2025-06-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address low sample efficiency and rigid difficulty adaptation in large language models’ (LLMs) mathematical reasoning, this paper proposes a customized curriculum learning framework. Methodologically, it integrates curriculum learning, adaptive difficulty modeling, dynamic prompt engineering, supervised fine-tuning (SFT), and reinforcement learning (RL) into a unified optimization pipeline. Key contributions include: (1) the first model capability-driven adaptive difficulty assessment mechanism, which dynamically quantifies instance difficulty based on model performance; and (2) Guided Prompting—a novel dynamic prompt injection technique that strategically reduces cognitive load for high-difficulty samples while enabling their effective incorporation into training via feedback. Evaluated on five mainstream mathematical reasoning benchmarks, the framework consistently outperforms uniform sampling across both SFT and RL paradigms, achieving significant gains in sample utilization efficiency and final reasoning accuracy.

📝 Abstract
Large Language Models (LLMs) have achieved remarkable performance across various reasoning tasks, yet post-training is constrained by inefficient sample utilization and inflexible processing of samples by difficulty. To address these limitations, we propose Customized Curriculum Learning (CCL), a novel framework with two key innovations. First, we introduce model-adaptive difficulty definition that customizes curriculum datasets based on each model's individual capabilities rather than using predefined difficulty metrics. Second, we develop "Guided Prompting," which dynamically reduces sample difficulty through strategic hints, enabling effective utilization of challenging samples that would otherwise degrade performance. Comprehensive experiments on supervised fine-tuning and reinforcement learning demonstrate that CCL significantly outperforms uniform training approaches across five mathematical reasoning benchmarks, confirming its effectiveness across both paradigms in enhancing sample utilization and model performance.
Problem

Research questions and friction points this paper is trying to address.

Inefficient sample utilization in LLM post-training
Inflexible processing of difficult samples in reasoning tasks
Need for adaptive difficulty definition in curriculum learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Customized curriculum learning for adaptive training
Model-adaptive difficulty definition for datasets
Guided Prompting dynamically reduces sample difficulty
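The two innovations above can be sketched together: difficulty is estimated from the model's own pass rate on each sample, and hard samples get progressively stronger hints until they become solvable, instead of being dropped. This is a minimal illustrative sketch, not the paper's implementation; the `Sample` class, the `solve` callable, the attempt count, and the difficulty threshold are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    problem: str
    hints: list  # hypothetical: ordered hints, weakest to strongest


def estimate_difficulty(sample, solve, n_attempts=8):
    """Model-adaptive difficulty: 1 minus the empirical pass rate,
    so difficulty reflects this model's capability, not a fixed metric."""
    passes = sum(solve(sample.problem) for _ in range(n_attempts))
    return 1.0 - passes / n_attempts


def guided_prompt(sample, solve, difficulty, threshold=0.75):
    """Guided Prompting (sketch): for hard samples, prepend progressively
    stronger hints until the model can solve the problem, so the sample
    still yields a usable training trajectory instead of being discarded."""
    if difficulty < threshold:
        return sample.problem  # easy enough: train on it as-is
    for hint in sample.hints:
        prompt = f"Hint: {hint}\n{sample.problem}"
        if solve(prompt):
            return prompt
    return None  # unsolved even with the strongest hint: skip for now


def build_curriculum(samples, solve):
    """Order the solvable samples from easy to hard (curriculum learning)."""
    scored = []
    for s in samples:
        d = estimate_difficulty(s, solve)
        prompt = guided_prompt(s, solve, d)
        if prompt is not None:
            scored.append((d, prompt))
    scored.sort(key=lambda pair: pair[0])  # easiest first
    return [prompt for _, prompt in scored]
```

In practice `solve` would be a (sampled) LLM inference call checked against a reference answer, and the resulting prompts would feed the SFT or RL loop; the sketch only shows the data-curation side of the pipeline.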