🤖 AI Summary
This work addresses a performance bottleneck in Language Server Protocol (LSP)-based code parsing for large TypeScript projects, where every symbol lookup requires a separate JSON-RPC call and these calls slow down function-level indexing. To remove this bottleneck, the authors propose abcoder-ts-parser, which builds the index directly on the TypeScript Compiler API. By working with the compiler's abstract syntax tree (AST), semantic information, and module resolution logic, the parser constructs UniAST, a graph-structured index that preserves call chains and dependency relationships. An evaluation on three open-source TypeScript projects of up to 1.2 million lines of code finds that abcoder-ts-parser produces reliable indexes significantly more efficiently than the existing LSP-based architecture.
📝 Abstract
Graph-based code indexing can improve context retrieval for LLM-based code agents by preserving call chains and dependency relationships that keyword search and similarity retrieval often miss. ABCoder is an open-source framework that parses codebases into a function-level code index called UniAST. Its existing parsers combine lightweight AST parsers for syntactic analysis with language servers for semantic resolution. Because LSP-based resolution requires a JSON-RPC call for each symbol lookup, these per-symbol calls become a bottleneck on large TypeScript repositories. We present abcoder-ts-parser, a TypeScript parser built on the TypeScript Compiler API that works directly with the compiler's AST, semantic information, and module resolution logic. We evaluate the parser on three open-source TypeScript projects with up to 1.2 million lines of code and find that it produces reliable indexes significantly more efficiently than the existing architecture. For a live demonstration, watch: https://youtu.be/ryssr7ouvdE
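
To make the idea of in-process semantic resolution concrete, the following is a minimal sketch and not the abcoder-ts-parser implementation. It uses the public TypeScript Compiler API to resolve the callee of each call expression through the type checker, with no JSON-RPC round trip per symbol. The in-memory file `demo.ts`, its contents, and the `callEdges` map are hypothetical names chosen for illustration; the actual UniAST schema is not described in this abstract and is not reproduced here.

```typescript
import * as ts from "typescript";

// Hypothetical in-memory input; a real indexer would load a project from tsconfig.json.
const fileName = "demo.ts";
const sourceText = [
  "function helper(x: number): number { return x + 1; }",
  "function main(): number { return helper(41); }",
].join("\n");

// noLib keeps the sketch small: no standard library declaration files are loaded.
const options: ts.CompilerOptions = { noLib: true, target: ts.ScriptTarget.ES2020 };

// Serve the single demo file from memory instead of reading it from disk.
const isDemo = (name: string): boolean =>
  name === fileName || name.endsWith("/" + fileName);
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) =>
  isDemo(name) ? ts.createSourceFile(name, sourceText, languageVersion, true) : undefined;
host.fileExists = (name) => isDemo(name);
host.readFile = (name) => (isDemo(name) ? sourceText : undefined);

const program = ts.createProgram([fileName], options, host);
const checker = program.getTypeChecker();
const sourceFile = program.getSourceFile(fileName);
if (!sourceFile) throw new Error("demo source file was not loaded");

// Function-level call edges: caller name -> names of the functions it calls.
const callEdges = new Map<string, Set<string>>();

function visit(node: ts.Node, enclosing: string | undefined): void {
  // Entering a named function declaration makes it the current caller.
  if (ts.isFunctionDeclaration(node) && node.name) {
    enclosing = node.name.text;
    if (!callEdges.has(enclosing)) callEdges.set(enclosing, new Set());
  }
  if (ts.isCallExpression(node) && enclosing !== undefined) {
    // The checker resolves the callee in-process.
    const symbol = checker.getSymbolAtLocation(node.expression);
    if (symbol) {
      // Imported names are aliases; follow them to the declaring symbol.
      const target =
        symbol.flags & ts.SymbolFlags.Alias ? checker.getAliasedSymbol(symbol) : symbol;
      callEdges.get(enclosing)?.add(target.getName());
    }
  }
  ts.forEachChild(node, (child) => visit(child, enclosing));
}

visit(sourceFile, undefined);

for (const [caller, callees] of callEdges) {
  console.log(caller, "->", [...callees].join(", "));
}
```

In this sketch every lookup is a direct call into the checker that already holds the whole program, which is the property the abstract contrasts with per-symbol LSP requests. Methods, arrow functions, and cross-file edges are omitted to keep the example short.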