Memorize Theorems, Not Instances: Probing SFT Generalization through Mathematical Reasoning

📅 2026-05-09

🤖 AI Summary

This work addresses the fragile generalization of supervised fine-tuning (SFT) in mathematical reasoning, which the authors attribute to models memorizing spurious surface-level correlations in problem-solution pairs. To mitigate this, they propose Theorem-SFT, which shifts the supervision signal from final answers to explicit theorem applications, encouraging models to learn how rules are invoked rather than to memorize instances. The authors argue that generalization failures arise not from memorization as a mechanism but from memorizing the wrong inductive targets. Theorem-SFT yields gains of +8.8% on MATH with LLaMA3.2-3B-Instruct and +20.27% on GeoQA with Qwen2.5-VL-7B-Instruct, without modality-specific retraining. Fine-tuning only the MLP layers matches full-layer fine-tuning, which points to feed-forward components as the primary locus of reasoning rules.

📝 Abstract

Supervised Fine-Tuning (SFT) is widely used for task-specific adaptation, yet recent work shows it systematically undermines reasoning generalization. We argue the root cause is not memorization itself, but its target: vanilla SFT drives models to exploit and memorize spurious surface correlations in problem-solution pairs, leaving them brittle to superficial input variations. To address this, we propose Theorem-SFT, which reorients supervision toward explicit theorem application by teaching models how rules are invoked rather than what answers look like. Theorem-SFT yields consistent gains across benchmarks and model families: +8.8% on MATH (LLaMA3.2-3B-Instruct) and +20.27% on GeoQA (Qwen2.5-VL-7B-Instruct) without modality-specific re-training. Fine-tuning MLP layers alone matches full-layer fine-tuning performance, implicating feed-forward components as the primary locus of reasoning rules. Our findings reframe the debate: generalization failures stem not from memorization as a mechanism, but from memorizing the wrong inductive targets.
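
The MLP-only fine-tuning mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that feed-forward parameters can be identified by the substring `.mlp.` in their names, as in the Hugging Face implementations of LLaMA- and Qwen-style decoders. The helper name `freeze_all_but_mlp` and the toy parameter table are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def freeze_all_but_mlp(named_params, marker=".mlp."):
    """Enable gradients only for parameters whose name contains `marker`.

    `named_params` is any iterable of (name, param) pairs where `param`
    has a writable `requires_grad` attribute. The `.mlp.` marker is an
    assumption about the naming scheme, not something stated in the paper.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in named_params:
        param.requires_grad = marker in name
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Hypothetical stand-in for a model's parameter table, so the sketch runs
# without downloading any weights.
toy_params = {
    "model.embed_tokens.weight": SimpleNamespace(requires_grad=True),
    "model.layers.0.self_attn.q_proj.weight": SimpleNamespace(requires_grad=True),
    "model.layers.0.mlp.gate_proj.weight": SimpleNamespace(requires_grad=True),
    "model.layers.0.mlp.down_proj.weight": SimpleNamespace(requires_grad=True),
    "lm_head.weight": SimpleNamespace(requires_grad=True),
}

trainable_names = freeze_all_but_mlp(toy_params.items())
print(trainable_names)
```

With a real PyTorch model, the same helper could be called as `freeze_all_but_mlp(model.named_parameters())` before building the optimizer, so that attention, embedding, and output-head weights stay frozen. Whether the paper selects layers this way is not stated in the abstract.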

Problem

Research questions and friction points this paper is trying to address.

Supervised Fine-Tuning

reasoning generalization

memorization

mathematical reasoning

surface correlations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Theorem-SFT

reasoning generalization

supervised fine-tuning