Tailoring the Curriculum: Student-Centered Reasoning Distillation via Dynamic Data-Model Compatibility

📅 2026-05-27
🤖 AI Summary
Existing reasoning distillation methods struggle to effectively transfer the complex reasoning capabilities of large language models due to a mismatch between training data and student model capacity. This work proposes a Data-Model Compatibility (DMC) metric that jointly evaluates data quality, relative difficulty, and the student model's current capability to dynamically select appropriately matched data for curriculum-based distillation. DMC is the first quantifiable compatibility assessment mechanism that updates adaptively throughout training, substantially improving distillation efficiency. Experimental results demonstrate that the DMC-guided data selection strategy consistently outperforms baseline approaches across multiple student architectures and reasoning tasks, with DMC scores showing strong correlation to final distillation performance.
📝 Abstract
Reasoning distillation transfers complex reasoning abilities from large language models (LLMs) to smaller ones, yet its success depends on how well the training data align with the student model. This paper introduces the Data-Model Compatibility (DMC) metric, which can be used to assess the suitability of a dataset for reasoning distillation on a student model. DMC provides an assessment by jointly considering data quality, relative difficulty, and student capability. We validated the effectiveness of DMC from two perspectives: (1) DMC exhibits a strong correlation with reasoning distillation performance; and (2) using DMC as the criterion for data selection leads to improved reasoning distillation performance. Both findings are consistently demonstrated across multiple student models and tasks. Moreover, since the DMC of each dataset dynamically changes during training, our experiments demonstrate that dynamically selecting datasets based on DMC can further enhance performance.
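The dynamic selection idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the DMC formula, so the score is passed in as a caller-supplied function, and the names `select_dataset`, `dynamic_curriculum`, `dmc_score`, `train_step`, and `student_state` are all hypothetical. The sketch only shows the control flow of re-scoring every candidate dataset against the current student before each training round.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical type aliases, introduced for illustration only.
Dataset = List[Any]
StudentState = Dict[str, Any]
ScoreFn = Callable[[str, Dataset, StudentState], float]
TrainFn = Callable[[StudentState, Dataset], StudentState]


def select_dataset(
    datasets: Dict[str, Dataset],
    student_state: StudentState,
    dmc_score: ScoreFn,
) -> str:
    """Return the name of the dataset with the highest compatibility score.

    `dmc_score` is a placeholder for whatever compatibility metric is used;
    its definition is an assumption left to the caller.
    """
    scores = {
        name: dmc_score(name, data, student_state)
        for name, data in datasets.items()
    }
    return max(scores, key=scores.get)


def dynamic_curriculum(
    datasets: Dict[str, Dataset],
    student_state: StudentState,
    dmc_score: ScoreFn,
    train_step: TrainFn,
    rounds: int,
) -> Tuple[List[str], StudentState]:
    """Re-score all datasets before each round and train on the best match."""
    schedule: List[str] = []
    for _ in range(rounds):
        # Scores are recomputed every round because the student changes.
        name = select_dataset(datasets, student_state, dmc_score)
        student_state = train_step(student_state, datasets[name])
        schedule.append(name)
    return schedule, student_state
```

Because the score is recomputed from the updated student state each round, the chosen dataset can change as training progresses, which is the behavior the abstract attributes to dynamic DMC-based selection. A static selection strategy would correspond to calling `select_dataset` once before training.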
Problem

Research questions and friction points this paper is trying to address.

reasoning distillation
data-model compatibility
student-centered learning
dynamic data selection
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-Model Compatibility
reasoning distillation
dynamic data selection
student-centered learning
large language models