A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings

📅 2025-05-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large reasoning models (LRMs) suffer from low inference efficiency because of excessively long chains of thought, and existing compression methods, which largely rest on the "overthinking" assumption, often degrade reasoning quality. To address this, the paper proposes A*-Thought, an efficient tree-search-based reasoning compression framework: it formalizes inference as a cost-weighted search tree and combines A* heuristic search with a bidirectional importance estimation mechanism (forward confidence plus backward influence) to identify critical reasoning spans and extract high-information-density, low-cost paths. Evaluated on several advanced math benchmarks, A*-Thought improves the performance of QwQ-32B by 2.39× under a low budget and shortens outputs by nearly 50% under a high budget, while preserving answer accuracy, and it generalizes to several other LRMs.

📝 Abstract
Large Reasoning Models (LRMs) achieve superior performance by extending the thought length. However, a lengthy thinking trajectory leads to reduced efficiency. Most of the existing methods are stuck in the assumption of overthinking and attempt to reason efficiently by compressing the Chain-of-Thought, but this often leads to performance degradation. To address this problem, we introduce A*-Thought, an efficient tree search-based unified framework designed to identify and isolate the most essential thoughts from the extensive reasoning chains produced by these models. It formulates the reasoning process of LRMs as a search tree, where each node represents a reasoning span in the giant reasoning space. By combining the A* search algorithm with a cost function specific to the reasoning path, it can efficiently compress the chain of thought and determine a reasoning path with high information density and low cost. In addition, we also propose a bidirectional importance estimation mechanism, which further refines this search process and enhances its efficiency beyond uniform sampling. Extensive experiments on several advanced math tasks show that A*-Thought effectively balances performance and efficiency over a huge search space. Specifically, A*-Thought can improve the performance of QwQ-32B by 2.39× with low-budget and reduce the length of the output token by nearly 50% with high-budget. The proposed method is also compatible with several other LRMs, demonstrating its generalization capability. The code can be accessed at: https://github.com/AI9Stars/AStar-Thought.
Problem

Research questions and friction points this paper is trying to address.

Efficient reasoning compression in low-resource settings
Balancing performance and efficiency in large reasoning models
Bidirectional compression for high-density, low-cost reasoning paths
Innovation

Methods, ideas, or system contributions that make the work stand out.

A*-Thought framework for efficient reasoning
Bidirectional compression of reasoning chains
Tree search with cost function optimization
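The contributions above center on casting compressed reasoning as best-first search over a tree of reasoning spans. As a rough illustration only (not the paper's implementation), a generic A* over such a tree might look like the sketch below; the node names, the token-count cost, and the trivial heuristic are placeholder assumptions standing in for the paper's cost function and bidirectional importance estimator:

```python
import heapq

# Illustrative sketch of A*-style search over a tree of reasoning spans.
# The cost function and heuristic are placeholders; the paper's actual
# bidirectional importance estimation is not reproduced here.

def a_star(root, children, g_cost, h_estimate, is_goal):
    """Return the lowest-cost root-to-goal path, or None if no goal exists.

    children(n)   -> candidate next reasoning spans after node n
    g_cost(path)  -> accumulated cost of a partial path (e.g., span count)
    h_estimate(n) -> heuristic estimate of the remaining cost from n
    is_goal(n)    -> True when n completes the reasoning chain
    """
    # Priority queue ordered by f = g + h, as in standard A*.
    frontier = [(h_estimate(root), [root])]
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if is_goal(node):
            return path
        for child in children(node):
            new_path = path + [child]
            f = g_cost(new_path) + h_estimate(child)
            heapq.heappush(frontier, (f, new_path))
    return None

# Toy reasoning tree: "q" is the question, "ans" the final answer,
# and the other strings stand in for intermediate reasoning spans.
tree = {"q": ["s1", "s2"], "s1": ["ans"], "s2": ["s3"], "s3": ["ans"]}

path = a_star(
    root="q",
    children=lambda n: tree.get(n, []),
    g_cost=lambda p: len(p),   # cost = number of spans kept
    h_estimate=lambda n: 0,    # trivial (admissible) heuristic
    is_goal=lambda n: n == "ans",
)
print(path)  # the shorter chain through "s1" is selected
```

Because the heuristic is admissible (it never overestimates), the first goal popped from the frontier lies on a minimum-cost path; swapping in a length-plus-importance cost would bias the search toward short, high-information chains, which is the spirit of the framework described above.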
👥 Authors
Xiaoang Xu
Beijing University of Posts and Telecommunications
Shuo Wang
Dept. of Comp. Sci. & Tech., Tsinghua University, Beijing, China; Institute for AI, Tsinghua University, Beijing, China; Beijing National Research Center for Information Science and Technology
Xu Han
Dept. of Comp. Sci. & Tech., Tsinghua University, Beijing, China; Institute for AI, Tsinghua University, Beijing, China; Beijing National Research Center for Information Science and Technology
Zhenghao Liu
Northeastern University
NLP; Information Retrieval
Huijia Wu
Beijing University of Posts and Telecommunications
Peipei Li
Beijing University of Posts and Telecommunications (BUPT)
Computer Vision; Image Synthesis; Face Recognition
Zhiyuan Liu
Dept. of Comp. Sci. & Tech., Tsinghua University, Beijing, China; Institute for AI, Tsinghua University, Beijing, China; Beijing National Research Center for Information Science and Technology
Maosong Sun
Professor of Computer Science and Technology, Tsinghua University
Natural Language Processing; Artificial Intelligence; Social Computing
Zhaofeng He
Beijing University of Posts and Telecommunications