🤖 AI Summary
To address the challenges of fragmented codebase-level knowledge, low retrieval efficiency, and difficulty of private deployment, this paper proposes Key-Augmented Neural Triggers (KANT). KANT introduces learnable “knowledge anchors” into both LLM training and inference, enabling localized modeling of cross-file semantic fragments to mitigate attention saturation and contextual redundancy. The method integrates self-generated synthetic data construction, anchor embedding alignment, and a lightweight neural triggering mechanism—requiring no large-scale annotated code repositories or cloud-based services. Experiments show that KANT achieves a 79% preference rate in an evaluation by a LocalStack expert, over 60% overall human preference, and up to 85% reduction in inference latency. It supports low-overhead, high-coverage local deployment, significantly enhancing code intent understanding and repository-scale knowledge sharing.
📝 Abstract
Repository-level code comprehension and knowledge sharing remain core challenges in software engineering. Large language models (LLMs) have shown promise by generating explanations of program structure and logic. However, these approaches still face limitations: First, relevant knowledge is distributed across multiple files within a repository (semantic fragmentation). Second, retrieval inefficiency and attention saturation degrade performance in RAG pipelines, where long, unaligned contexts overwhelm attention. Third, repository-specific training data is scarce and often outdated. Finally, proprietary LLMs hinder industrial adoption due to privacy and deployment constraints. To address these issues, we propose Key-Augmented Neural Triggers (KANT), a novel approach that embeds knowledge anchors into both training and inference. Unlike prior methods, KANT enables internal access to repository-specific knowledge, reducing fragmentation and grounding inference in localized context. Moreover, we synthesize specialized data directly from code. At inference, knowledge anchors replace verbose context, reducing token overhead and latency while supporting efficient, on-premise deployment. We evaluate KANT via: a qualitative human evaluation of the synthesized dataset's intent coverage and quality across five dimensions; a comparison against SOTA baselines across five qualitative dimensions and inference speed; and replication across different LLMs to assess generalizability. Results show that the synthetic training data aligned with information-seeking needs. KANT achieved over 60% preference from human annotators and from a LocalStack expert, who preferred KANT's output in 79% of cases. KANT also reduced inference latency by up to 85% across all models. Overall, it is well-suited for scalable, low-latency, on-premise deployments, providing a strong foundation for code comprehension.
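The abstract does not detail the anchor mechanism's internals, but the core efficiency idea — a small set of learned anchor vectors standing in for a long retrieved context at inference time — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation; all names, dimensions, and token counts below are assumptions chosen for illustration.

```python
import random

# Illustrative model dimension and toy "embeddings" (plain Python lists,
# standing in for learned parameter tensors).
d_model = 64
random.seed(0)

def rand_vec():
    return [random.random() for _ in range(d_model)]

# RAG-style inference: a long retrieved cross-file context is embedded
# token by token and prepended to the query.
retrieved_context_tokens = 2000  # assumed size of verbose retrieved context
query_tokens = 30
rag_seq_len = retrieved_context_tokens + query_tokens

# KANT-style inference (as sketched here): a handful of anchor vectors,
# learned during training, replace the verbose context entirely.
num_anchors = 16  # assumed anchor count
anchor_embeddings = [rand_vec() for _ in range(num_anchors)]
query_embeddings = [rand_vec() for _ in range(query_tokens)]
kant_input = anchor_embeddings + query_embeddings

print(rag_seq_len, len(kant_input))  # → 2030 46
```

Since self-attention cost grows quadratically with sequence length, shrinking the input from ~2030 to ~46 positions in this toy setup would cut attention computation by orders of magnitude, which is consistent in spirit with the latency reductions the paper reports.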