Memorize Theorems, Not Instances: Probing SFT Generalization through Mathematical Reasoning

📅 2026-05-09
🤖 AI Summary
This work addresses the fragile generalization of supervised fine-tuning (SFT) in mathematical reasoning, which the authors attribute to models memorizing spurious surface-level correlations in problem-solution pairs. They propose Theorem-SFT, which shifts the supervision signal from what answers look like to explicit theorem application, encouraging models to learn how rules are invoked rather than to memorize instances. The authors argue that generalization failures arise not from memorization as a mechanism but from memorizing the wrong inductive targets. Theorem-SFT improves results by 8.8% on MATH with LLaMA3.2-3B-Instruct and by 20.27% on GeoQA with Qwen2.5-VL-7B-Instruct, without modality-specific retraining. Fine-tuning only the MLP layers matches fine-tuning all layers, which points to feed-forward components as the primary locus of reasoning rules.
📝 Abstract
Supervised Fine-Tuning (SFT) is widely used for task-specific adaptation, yet recent work shows it systematically undermines reasoning generalization. We argue the root cause is not memorization itself, but its target: vanilla SFT drives models to exploit and memorize spurious surface correlations in problem-solution pairs, leaving them brittle to superficial input variations. To address this, we propose Theorem-SFT, which reorients supervision toward explicit theorem application by teaching models how rules are invoked rather than what answers look like. Theorem-SFT yields consistent gains across benchmarks and model families: +8.8% on MATH (LLaMA3.2-3B-Instruct) and +20.27% on GeoQA (Qwen2.5-VL-7B-Instruct) without modality-specific re-training. Fine-tuning MLP layers alone matches full-layers performance, implicating feed-forward components as the primary locus of reasoning rules. Our findings reframe the debate: Generalization failures stem not from memorization as a mechanism, but from memorizing the wrong inductive targets.
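The MLP-only setting mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes that feed-forward parameters can be recognized by a `.mlp.` substring in their names, as in Hugging Face LLaMA-style checkpoints. The helper `select_mlp_params` and the parameter names below are hypothetical and serve only as illustration.

```python
from typing import Iterable, List


def select_mlp_params(param_names: Iterable[str], marker: str = ".mlp.") -> List[str]:
    """Return the parameter names that belong to feed-forward (MLP) blocks.

    Assumption: MLP parameters contain `marker` in their name. This naming
    convention is not taken from the paper.
    """
    return [name for name in param_names if marker in name]


# Hypothetical parameter names in the style of a LLaMA-like checkpoint.
names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.mlp.up_proj.weight",
    "model.layers.0.mlp.down_proj.weight",
    "model.layers.0.input_layernorm.weight",
    "lm_head.weight",
]

# Only these parameters would receive gradient updates; the rest stay frozen.
trainable = select_mlp_params(names)
print(trainable)
```

With a PyTorch model, the same rule could be applied by iterating over `model.named_parameters()` and setting `requires_grad` to `True` only for the selected names. Whether this matches the paper's exact layer selection is an assumption.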
Problem

Research questions and friction points this paper is trying to address.

Supervised Fine-Tuning
reasoning generalization
memorization
mathematical reasoning
surface correlations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Theorem-SFT
reasoning generalization
supervised fine-tuning
mathematical reasoning
inductive bias