Student Guides Teacher: Weak-to-Strong Inference via Spectral Orthogonal Exploration

📅 2026-01-06

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models often suffer from “reasoning collapse” in complex mathematical reasoning and long-horizon planning, degenerating into low-rank bias manifolds and failing to explore high-value solution spaces. To address this, this work proposes the Spectral Orthogonal Exploration (SOE) framework, which introduces a novel “student-guided teacher” paradigm. SOE leverages weaker auxiliary agents as orthogonal probes to explicitly navigate the null space of a stronger model, thereby transcending the limitations of conventional imitation learning from a geometric perspective and eliciting semantically diverse yet high-value reasoning trajectories. Evaluated on mathematical reasoning benchmarks, SOE achieves a 62.4% average improvement in accuracy and a 113.7% gain in sampling efficiency over baseline methods.

📝 Abstract

While Large Language Models (LLMs) demonstrate near-human capabilities, they often suffer from "Reasoning Collapse" in complex mathematical proving and long-horizon planning. Models tend to degenerate into a low-rank Bias Manifold, where stochastic sampling merely produces lexical variations of erroneous logic rather than semantic exploration. This geometric collapse renders the model "blind" to high-value solutions that lie within its Null Space. To address this, we propose Spectral Orthogonal Exploration (SOE), a geometric framework operating on a counter-intuitive "Student Guides Teacher" paradigm. Specifically, we utilize a weak auxiliary agent not for imitation, but as an orthogonal probe. By explicitly navigating the Teacher's Null Space, SOE serves as a geometric bridge, effectively ejecting the model from local optima to explore diverse, high-value solution spaces. Experiments on mathematical benchmarks demonstrate that, relative to baseline methods, our approach improves average accuracy by 62.4% and increases average sampling efficiency by 113.7%, indicating a promising path toward overcoming performance plateaus in advanced reasoning tasks.
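
The abstract does not spell out how SOE computes its probe, so the following is only a linear-algebra illustration of the terms "low-rank subspace", "null space", and "orthogonal probe", not the paper's method. It assumes, hypothetically, that the teacher's behavior is summarized by a matrix of hidden-state vectors and that a student supplies a single direction vector; the names `orthogonal_component`, `teacher_states`, and `student_probe` are invented for this sketch.

```python
import numpy as np

def orthogonal_component(probe, states, rank):
    """Return the part of `probe` orthogonal to the top-`rank` row subspace of `states`."""
    # Rows of `states` are vectors; the right singular vectors give an
    # orthonormal basis for the directions those rows occupy.
    _, _, vt = np.linalg.svd(states, full_matrices=False)
    basis = vt[:rank]  # shape (rank, dim), orthonormal rows
    # Subtract the projection onto the occupied subspace.
    return probe - basis.T @ (basis @ probe)

rng = np.random.default_rng(0)
dim, rank = 16, 3

# Hypothetical "teacher" states confined to a rank-3 subspace (assumption).
teacher_states = rng.normal(size=(50, rank)) @ rng.normal(size=(rank, dim))

# Hypothetical "student" direction (assumption).
student_probe = rng.normal(size=dim)

# What remains lies in the null space of the teacher-state matrix.
residual = orthogonal_component(student_probe, teacher_states, rank)
```

In this toy setting, `teacher_states @ residual` is numerically zero: the residual points only along directions that the teacher's states never occupy.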

Problem

Research questions and friction points this paper is trying to address.

Reasoning Collapse

Bias Manifold

Null Space

Large Language Models

Mathematical Proving

Innovation

Methods, ideas, or system contributions that make the work stand out.

Spectral Orthogonal Exploration

Reasoning Collapse

Null Space Exploration