🤖 AI Summary
This work addresses the high computational cost of existing repository-scale code reasoning methods, which struggle to balance accuracy against context-retrieval cost because they explore the full codebase exhaustively. To overcome this limitation, we propose a "recon-then-reason" paradigm that first performs lightweight structural reconnaissance over a semantic-structural code map to locate the relevant code targets without loading full source text. A cost-aware strategy, combined with dependency-tracking tools, then constructs a high-value context in a single step. Evaluated on the SWE-QA, LongCodeQA, LOC-BENCH, and GitTaskBench benchmarks, the approach outperforms current state-of-the-art methods in reasoning accuracy while reducing token consumption.
📝 Abstract
Repository-scale code reasoning is a cornerstone of modern AI-assisted software engineering, enabling Large Language Models (LLMs) to handle workflows ranging from program comprehension to complex debugging. However, balancing accuracy with context cost remains a significant bottleneck, as existing agentic approaches often waste computational resources through inefficient, iterative full-text exploration. To address this, we introduce FastCode, a framework that decouples repository exploration from content consumption. FastCode uses a structural scouting mechanism to navigate a lightweight semantic-structural map of the codebase, allowing the system to trace dependencies and pinpoint relevant targets without the overhead of full-text ingestion. By leveraging structure-aware navigation tools regulated by a cost-aware policy, the framework constructs high-value contexts in a single, optimized step. Extensive evaluations on the SWE-QA, LongCodeQA, LOC-BENCH, and GitTaskBench benchmarks demonstrate that FastCode consistently outperforms state-of-the-art baselines in reasoning accuracy while significantly reducing token consumption, validating the efficiency of scouting-first strategies for large-scale code reasoning. Source code is available at https://github.com/HKUDS/FastCode.
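The scouting-first idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_structure_map` and `scout`, the keyword-overlap scoring, the one-hop call following, and the use of line counts as a stand-in for token cost are all assumptions made for illustration. The sketch builds a lightweight map of function names, docstrings, and call edges with Python's standard `ast` module, then selects targets under a fixed budget without ever keeping function bodies.

```python
import ast


def build_structure_map(sources):
    """Map 'path::function' to lightweight metadata; function bodies are not stored.

    `sources` is a hypothetical dict of {file path: source text}.
    """
    structure = {}
    for path, text in sources.items():
        tree = ast.parse(text)
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Record only direct calls to plain names, e.g. foo(...), as dependency edges.
                calls = sorted({
                    c.func.id
                    for c in ast.walk(node)
                    if isinstance(c, ast.Call) and isinstance(c.func, ast.Name)
                })
                structure[f"{path}::{node.name}"] = {
                    "name": node.name,
                    "doc": ast.get_docstring(node) or "",
                    "calls": calls,
                    # Assumption: line count is a rough proxy for token cost.
                    "cost": node.end_lineno - node.lineno + 1,
                }
    return structure


def scout(structure, query, budget):
    """Pick targets by keyword overlap, follow one hop of calls, stay within budget."""
    terms = set(query.lower().split())

    def score(meta):
        words = (meta["name"].replace("_", " ") + " " + meta["doc"]).lower().split()
        return len(terms & set(words))

    ranked = sorted(structure, key=lambda k: (-score(structure[k]), k))
    by_name = {meta["name"]: key for key, meta in structure.items()}

    selected, spent = [], 0
    for key in ranked:
        if score(structure[key]) == 0:
            break  # remaining entries have no overlap with the query
        callees = [by_name[c] for c in structure[key]["calls"] if c in by_name]
        for cand in [key] + callees:
            cost = structure[cand]["cost"]
            if cand not in selected and spent + cost <= budget:
                selected.append(cand)
                spent += cost
    return selected, spent
```

Only the entries returned by `scout` would then have their full text loaded into the model's context, which is the sense in which exploration is decoupled from content consumption. A real system would replace the keyword overlap with a semantic signal and the line-count proxy with an actual token count.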