Reasoning Scaffolding: Distilling the Flow of Thought from LLMs

📅 2025-09-27
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing LLM-to-SLM distillation approaches rely predominantly on text-level behavioral cloning, imitating superficial reasoning traces while neglecting the underlying algorithmic structure, which leads to poor logical robustness. This paper introduces the Reasoning Scaffolding framework, which models reasoning as an interpretable sequence of semantic signals and uses multi-task learning to jointly predict the reasoning flow and generate each step, enabling algorithm-level transfer of cognitive structure. The method integrates semantic-signal modeling, structured generation, and phased supervision to improve the logical coherence and generalization of student models. Evaluated on GSM8K, MATH, and algorithmic reasoning benchmarks, the approach consistently outperforms prior distillation methods, achieving an average accuracy gain of 8.2% and a 37% reduction in erroneous reasoning paths. It is the first work to systematically realize interpretable modeling and effective transfer of algorithmic reasoning structure in SLMs.
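To make the notion of a semantic-signal scaffold concrete, here is a minimal sketch of how a teacher rationale might be abstracted into discrete signals. The signal taxonomy and the marker-to-signal lexicon are hypothetical stand-ins; the paper's actual extraction procedure is not reproduced here.

```python
# Minimal sketch: abstracting a teacher rationale into a semantic-signal
# scaffold. The signal taxonomy and the marker lexicon below are hypothetical
# stand-ins, not the paper's actual extraction procedure.

MARKER_TO_SIGNAL = {
    "however": "Contrast",
    "but": "Contrast",
    "also": "Addition",
    "furthermore": "Addition",
    "therefore": "Conclusion",
    "so": "Conclusion",
}

def extract_scaffold(rationale_steps):
    """Map each reasoning step to a discrete semantic signal.

    Steps without a recognized cue word fall back to a generic 'Continue'.
    """
    scaffold = []
    for step in rationale_steps:
        tokens = [t.strip(".,;:") for t in step.lower().split()]
        signal = next(
            (MARKER_TO_SIGNAL[t] for t in tokens if t in MARKER_TO_SIGNAL),
            "Continue",
        )
        scaffold.append(signal)
    return scaffold

steps = [
    "The train covers 60 km in the first hour.",
    "However, it slows to 40 km/h for the second hour.",
    "Therefore, it covers 100 km in total.",
]
print(extract_scaffold(steps))  # ['Continue', 'Contrast', 'Conclusion']
```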

📝 Abstract
The prevailing approach to distilling reasoning from Large Language Models (LLMs), behavioral cloning from textual rationales, is fundamentally limited. It teaches Small Language Models (SLMs) to mimic surface-level patterns rather than the underlying algorithmic structure of thought, resulting in a critical lack of logical robustness. We argue that instead of cloning text, distillation should transfer this algorithmic structure directly. We introduce Reasoning Scaffolding, a framework that reframes reasoning as a structured generation process. Our method first abstracts the teacher's thought process into a sequence of discrete, interpretable semantic signals (e.g., Contrast, Addition) that act as a scaffold. The student model is then trained via a multi-task objective to both (1) predict the next semantic signal, anticipating the reasoning flow, and (2) generate the corresponding step, conditioned on that signal. This multi-task scheme acts as a powerful regularizer, compelling the student to internalize the computational patterns of coherent reasoning. On a suite of challenging reasoning benchmarks, our method significantly outperforms state-of-the-art distillation in both accuracy and logical consistency, providing a path toward creating smaller models that are genuine reasoners, not just fluent mimics.
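The two-headed objective described in the abstract can be sketched as follows. This is a minimal PyTorch illustration under assumed names (ScaffoldedStudent, signal_head, lm_head) and a stand-in GRU backbone; the paper's actual architecture, conditioning mechanism, and loss weighting are not specified here.

```python
# Minimal PyTorch sketch of the two-headed multi-task objective: one head
# predicts the next semantic signal, the other generates the step's tokens
# conditioned on that signal. The GRU backbone and all names/dimensions are
# assumptions for illustration, not the paper's architecture.
import torch
import torch.nn as nn

class ScaffoldedStudent(nn.Module):
    def __init__(self, vocab_size, num_signals, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.backbone = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        # Head 1: anticipate the reasoning flow (next semantic signal).
        self.signal_head = nn.Linear(hidden_dim, num_signals)
        # Head 2: generate the next step, conditioned on the gold signal.
        self.signal_embed = nn.Embedding(num_signals, hidden_dim)
        self.lm_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, input_ids, signal_ids):
        h, _ = self.backbone(self.embed(input_ids))      # (B, T, H)
        signal_logits = self.signal_head(h[:, -1])       # (B, num_signals)
        # Teacher-forced conditioning: add the signal embedding to every state.
        h_cond = h + self.signal_embed(signal_ids).unsqueeze(1)
        token_logits = self.lm_head(h_cond)              # (B, T, vocab)
        return signal_logits, token_logits

def multitask_loss(signal_logits, token_logits, signal_target, token_targets,
                   alpha=0.5):
    """Weighted sum of signal-prediction and step-generation losses."""
    ce = nn.CrossEntropyLoss()
    loss_signal = ce(signal_logits, signal_target)
    loss_gen = ce(token_logits.reshape(-1, token_logits.size(-1)),
                  token_targets.reshape(-1))
    return alpha * loss_signal + (1.0 - alpha) * loss_gen
```

Tying both losses to the same hidden states is what would make the signal head act as a regularizer: the backbone must encode the upcoming reasoning move, not just surface continuations.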
Problem

Research questions and friction points this paper is trying to address.

Distill algorithmic reasoning structure from LLMs to SLMs
Improve logical robustness in small language models
Replace behavioral cloning with structured reasoning transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distills algorithmic thought structure from LLMs
Uses semantic signals as reasoning scaffold
Trains the student model with a multi-task objective under phased supervision (see the sketch after this list)
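As referenced above, phased supervision could be realized as a simple two-phase schedule, first emphasizing signal prediction and then training both heads jointly. This sketch assumes the hypothetical ScaffoldedStudent and multitask_loss definitions from the earlier sketch; the phase lengths and weights are assumptions, not the paper's settings.

```python
# Hypothetical phased-supervision schedule building on the earlier sketch:
# phase 1 trains with the signal loss only (alpha = 1), phase 2 trains jointly.
# Phase lengths and loss weights are assumptions, not the paper's settings.
import torch

def train_phased(model, loader, epochs_signal=2, epochs_joint=8, lr=3e-4):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for epoch in range(epochs_signal + epochs_joint):
        alpha = 1.0 if epoch < epochs_signal else 0.5
        for input_ids, signal_ids, signal_target, token_targets in loader:
            signal_logits, token_logits = model(input_ids, signal_ids)
            loss = multitask_loss(signal_logits, token_logits,
                                  signal_target, token_targets, alpha=alpha)
            opt.zero_grad()
            loss.backward()
            opt.step()
```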
Xiangyu Wen
Computer Science and Engineering, CUHK
AI Security, LLM Reasoning
Junhua Huang
HUAWEI Noah’s Ark Lab
Zeju Li
The Chinese University of Hong Kong
Min Li
Southeast University
Jianyuan Zhong
The Chinese University of Hong Kong
Machine Learning
Zhijian Xu
University of Science and Technology of China
Natural Language Processing
Mingxuan Yuan
HUAWEI Noah’s Ark Lab
Yongxiang Huang
HUAWEI Hong Kong Research Center
Qiang Xu
The Chinese University of Hong Kong