🤖 AI Summary
Large language models often suffer from “reasoning collapse” in complex mathematical reasoning and long-horizon planning, degenerating into low-rank bias manifolds and failing to explore high-value solution spaces. To address this, this work proposes the Spectral Orthogonal Exploration (SOE) framework, which introduces a “student-guided teacher” paradigm. SOE uses a weaker auxiliary agent as an orthogonal probe that explicitly navigates the null space of a stronger model, rather than as a target for imitation, and thereby elicits semantically diverse yet high-value reasoning trajectories. On mathematical reasoning benchmarks, SOE improves average accuracy by 62.4% and average sampling efficiency by 113.7% relative to baseline methods.
📝 Abstract
While Large Language Models (LLMs) demonstrate near-human capabilities, they often suffer from "Reasoning Collapse" in complex mathematical proving and long-horizon planning. Models tend to degenerate into a low-rank Bias Manifold, where stochastic sampling merely produces lexical variations of erroneous logic rather than semantic exploration. This geometric collapse renders the model "blind" to high-value solutions that lie within its Null Space. To address this, we propose Spectral Orthogonal Exploration (SOE), a geometric framework operating on a counter-intuitive "Student Guides Teacher" paradigm. Specifically, we utilize a weak auxiliary agent not for imitation, but as an orthogonal probe. By explicitly navigating the Teacher's Null Space, SOE serves as a geometric bridge, effectively ejecting the model from local optima to explore diverse, high-value solution spaces. Experiments on mathematical benchmarks demonstrate that, relative to baseline methods, our approach improves average accuracy by 62.4% and increases average sampling efficiency by 113.7%, indicating a promising path toward overcoming performance plateaus in advanced reasoning tasks.
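The abstract does not say how SOE builds the Teacher's subspace or the Student's probe, so the following is only a toy sketch of the underlying linear-algebra idea, not the authors' method. It assumes the Teacher's collapsed behavior can be summarized by a few direction vectors and that the Student's output is another vector in the same space. All names (`teacher_directions`, `student_probe`, `orthogonal_component`) are hypothetical. The sketch removes from the probe everything that lies in the span of the Teacher's directions, leaving the part in the orthogonal complement (the null space of the matrix whose rows are those directions).

```python
from typing import List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors: Sequence[Sequence[float]], tol: float = 1e-10) -> List[Vector]:
    """Gram-Schmidt: orthonormal basis for the span of `vectors`.

    Vectors that are (numerically) dependent on earlier ones are dropped.
    """
    basis: List[Vector] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def orthogonal_component(probe: Sequence[float], basis: Sequence[Sequence[float]]) -> Vector:
    """Part of `probe` orthogonal to every vector in an orthonormal `basis`."""
    residual = list(probe)
    for b in basis:
        coeff = dot(residual, b)
        residual = [ri - coeff * bi for ri, bi in zip(residual, b)]
    return residual


# Toy data (made up for illustration): the "Teacher" only varies within the
# x-y plane, while the "Student" probe also has a component along z.
teacher_directions = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
student_probe = [0.5, 0.5, 2.0]

basis = orthonormal_basis(teacher_directions)
novel = orthogonal_component(student_probe, basis)
print(novel)  # only the z-component survives
```

In this toy setting the surviving component is exactly the direction the Teacher's low-rank subspace cannot express, which is the sense in which a weaker agent can still act as a useful probe.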