Enhancing Large Language Model Reasoning via Selective Critical Token Fine-Tuning

📅 2025-10-12
📈 Citations: 0
Influential: 0

🤖 AI Summary
Standard supervised fine-tuning (SFT) uniformly penalizes all tokens, degrading output diversity and generalization in mathematical reasoning. Method: We propose selective critical-token fine-tuning, which identifies sparse, causally critical tokens—those whose perturbation alters reasoning correctness—via counterfactual analysis, and applies gradient updates exclusively at these positions while preserving the original token distributions elsewhere to maintain diversity and robustness. The method integrates seamlessly into standard SFT and supports test-time sampling extensions and reinforcement learning initialization. Results: Experiments across three model families (5 models) and 11 mathematical reasoning benchmarks show that fine-tuning fewer than 12% of tokens consistently outperforms full SFT, while increasing output entropy and improving training stability.
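The core mechanism described above, applying the SFT loss only at critical positions while leaving other tokens untouched, can be sketched as a masked negative log-likelihood. This is a minimal, dependency-free illustration, not the paper's implementation: the logits, targets, and mask are toy stand-ins, and `masked_nll` is a hypothetical helper name.

```python
import math

def masked_nll(logits, targets, critical_mask):
    """Average negative log-likelihood over critical positions only.

    logits: per-position lists of raw scores (toy model output)
    targets: gold token ids, one per position
    critical_mask: 1 where a token is deemed critical, 0 elsewhere
    """
    total, count = 0.0, 0
    for pos_logits, tgt, keep in zip(logits, targets, critical_mask):
        if not keep:
            continue  # non-critical tokens contribute no loss (and no gradient)
        # numerically stable log-sum-exp for the softmax normalizer
        z = max(pos_logits)
        log_norm = z + math.log(sum(math.exp(l - z) for l in pos_logits))
        total += log_norm - pos_logits[tgt]
        count += 1
    return total / max(count, 1)

# Toy example: three positions, only positions 0 and 2 marked critical.
logits = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
loss = masked_nll(logits, targets=[0, 1, 0], critical_mask=[1, 0, 1])
```

Because the mask simply zeroes out positions before averaging, the scheme drops into a standard SFT loop unchanged, which matches the summary's claim that the method "integrates seamlessly into standard SFT".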

📝 Abstract
Large language models (LLMs) primarily rely on supervised fine-tuning (SFT) as a key method to adapt pre-trained models to domain-specific tasks such as mathematical reasoning. However, standard SFT uniformly penalizes all tokens, neglecting that only a small subset of critical tokens determines reasoning correctness. This uniform supervision often causes reduced output diversity and limited generalization. We propose Critical Token Fine-tuning (CFT), a simple yet effective approach that updates only tokens identified as functionally indispensable via counterfactual perturbations. By focusing gradient signals on these decisive reasoning steps while preserving the diversity of non-critical tokens, CFT can enhance both generation quality and diversity. Extensive experiments on five models across three families (Qwen, OLMo, LLaMA) and eleven mathematical reasoning benchmarks show that CFT, despite fine-tuning on less than 12% of tokens, consistently outperforms standard SFT. Moreover, CFT enables test-time scaling through improved sampling diversity and provides a stronger initialization for reinforcement learning, sustaining performance gains in later training stages while maintaining higher entropy for better exploration. These results highlight CFT as a practical and general framework for efficient and robust LLM fine-tuning.
Problem

Research questions and friction points this paper is trying to address.

Standard SFT penalizes all tokens uniformly, reducing output diversity and generalization
Only a small subset of tokens causally determines reasoning correctness, yet uniform supervision treats every token as equally important
How to improve mathematical reasoning performance without sacrificing generation diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tunes only critical tokens identified via counterfactual perturbations
Focuses gradients on decisive reasoning steps
Enhances generation diversity while maintaining performance
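The counterfactual criterion behind these contributions is that a token is critical if substituting it flips the correctness of the reasoning outcome. A minimal sketch of that idea, assuming a toy `is_correct` verifier and a `perturb` function as hypothetical stand-ins for the paper's actual counterfactual analysis:

```python
def find_critical_tokens(tokens, is_correct, perturb, vocab_sample):
    """Mark tokens whose substitution flips reasoning correctness.

    is_correct: verifier over a token sequence -> bool (hypothetical)
    perturb: yields alternative tokens to try at a position (hypothetical)
    Returns a 0/1 mask, 1 at positions that are causally critical.
    """
    base = is_correct(tokens)
    mask = []
    for i in range(len(tokens)):
        flipped = False
        for alt in perturb(tokens[i], vocab_sample):
            candidate = tokens[:i] + [alt] + tokens[i + 1:]
            if is_correct(candidate) != base:  # counterfactual flips the outcome
                flipped = True
                break
        mask.append(1 if flipped else 0)
    return mask

# Toy demo: correctness depends only on the final answer token.
tokens = [3, 4, 7]
verifier = lambda toks: toks[-1] == 7
swaps = lambda tok, vocab: [v for v in vocab if v != tok]
mask = find_critical_tokens(tokens, verifier, swaps, [1, 2, 3])  # -> [0, 0, 1]
```

In this toy setup only the final answer token is critical, illustrating the paper's observation that the critical set is sparse; the resulting mask is exactly what a selective loss would consume.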
Zhiwen Ruan, Southern University of Science and Technology
Yixia Li, Southern University of Science and Technology
He Zhu, Peking University
Yun Chen, Shanghai University of Finance and Economics
Peng Li, Tsinghua University
Yang Liu, Tsinghua University
Guanhua Chen, Southern University of Science and Technology