🤖 AI Summary
This work proposes a tree-structured language modeling approach to overcome a limitation of conventional autoregressive models: their sequential generation paradigm makes systematic divergent reasoning difficult. By introducing special branching tokens, the model constructs and selectively expands multiple reasoning paths in parallel within a single forward pass. Crucially, it uses complete search trees—including both successful and failed trajectories—as supervision, enabling the model to internalize effective exploration strategies without relying on external search mechanisms. By combining tree-structured sequence modeling, shared-prefix optimization, and training on full search trees, the method achieves significant gains on tasks requiring divergent thinking while retaining efficient one-pass inference, establishing a scalable new paradigm for inference-time computation.
📝 Abstract
Language models generate reasoning sequentially, preventing them from discarding irrelevant exploration paths during search. We introduce Tree-Structured Language Modeling (TSLM), which uses special tokens to encode branching structure, enabling models to generate and selectively expand multiple search paths within a single generation process. By training on complete search trees, including both successful and failed attempts, TSLM learns to internalize systematic exploration without redundant recomputation of shared prefixes. TSLM achieves robust performance and superior inference efficiency by avoiding the multiple independent forward passes required by external search methods. These results suggest a new paradigm of inference-time scaling for robust reasoning, demonstrating that supervised learning on complete tree-structured traces provides an efficient alternative for developing systematic exploration capabilities in language models.
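To make the branching-token idea concrete, the following is a minimal sketch of how a search tree might be linearized into a single token sequence for training. The marker names `<branch>`/`<end>` and the `Node` structure are illustrative assumptions, not the paper's actual vocabulary or implementation:

```python
# Hypothetical sketch: serializing a search tree into one linear sequence
# using special branching tokens, so shared prefixes are emitted only once.
# <branch> / <end> are assumed marker names, not TSLM's real special tokens.
from dataclasses import dataclass, field

@dataclass
class Node:
    tokens: list                       # tokens for this reasoning step
    children: list = field(default_factory=list)

BRANCH, END = "<branch>", "<end>"

def linearize(node):
    """Depth-first serialization: emit this node's tokens once, then wrap
    each child subtree in <branch> ... <end> markers."""
    seq = list(node.tokens)
    for child in node.children:
        seq.append(BRANCH)
        seq.extend(linearize(child))
        seq.append(END)
    return seq

# A root step with two alternative continuations, one of which fails.
# The shared prefix ["think"] appears only once in the flat sequence,
# and the failed branch is kept as a supervision signal.
tree = Node(["think"], [
    Node(["try", "A"], [Node(["fail"])]),
    Node(["try", "B", "succeed"]),
])
print(linearize(tree))
# → ['think', '<branch>', 'try', 'A', '<branch>', 'fail', '<end>', '<end>',
#    '<branch>', 'try', 'B', 'succeed', '<end>']
```

Under this assumed encoding, a single forward pass over the flat sequence covers every path in the tree, and an attention mask restricted to each token's ancestors would keep sibling branches independent while reusing the shared prefix.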