From Trace to Line: LLM Agent for Real-World OSS Vulnerability Localization

📅 2025-09-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing LLM-based vulnerability detection methods mostly operate at the function or file level. They struggle with long contexts and do not provide the line-level localization and targeted patches that engineers need. To address these limitations, the authors propose T2L-Agent (Trace-to-Line Agent), a project-level, end-to-end framework built around an Agentic Trace Analyzer (ATA). ATA fuses runtime evidence (crash points, stack traces, and coverage deltas) with AST-based code chunking, and combines it with agent-driven planning and multi-round feedback to narrow the search progressively from modules to the exact vulnerable lines. The paper also introduces T2L-ARVO, an expert-verified 50-case benchmark spanning five crash families and real-world projects, designed to evaluate both coarse-grained detection and fine-grained, line-level localization. On T2L-ARVO, T2L-Agent achieves up to 58.0% detection and 54.8% line-level localization, substantially outperforming baselines. Together, the framework and benchmark move LLM-based vulnerability detection toward precise, practical diagnosis for real-world open-source software.

📝 Abstract

Large language models show promise for vulnerability discovery, yet prevailing methods inspect code in isolation, struggle with long contexts, and focus on coarse function- or file-level detections, offering limited actionable guidance to engineers who need precise line-level localization and targeted patches in real-world software development. We present T2L-Agent (Trace-to-Line Agent), a project-level, end-to-end framework that plans its own analysis and progressively narrows scope from modules to exact vulnerable lines. T2L-Agent couples multi-round feedback with an Agentic Trace Analyzer (ATA) that fuses runtime evidence (crash points, stack traces, and coverage deltas) with AST-based code chunking, enabling iterative refinement beyond single-pass predictions and translating symptoms into actionable, line-level diagnoses. To benchmark line-level vulnerability discovery, we introduce T2L-ARVO, a diverse, expert-verified 50-case benchmark spanning five crash families and real-world projects. T2L-ARVO is specifically designed to support both coarse-grained detection and fine-grained localization, enabling rigorous evaluation of systems that aim to move beyond file-level predictions. On T2L-ARVO, T2L-Agent achieves up to 58.0% detection and 54.8% line-level localization, substantially outperforming baselines. Together, the framework and benchmark push LLM-based vulnerability detection from coarse identification toward deployable, robust, precision diagnostics that reduce noise and accelerate patching in open-source software workflows.
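
The abstract mentions AST-based code chunking without giving details, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes Python source and the standard-library ast module as a stand-in for whatever parser T2L-Agent uses on real projects; the names Chunk, chunk_source, and innermost_chunk are hypothetical. It splits a file into one chunk per function or class and maps a crash line from a stack trace to the smallest enclosing chunk.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    name: str
    start: int  # 1-based first line of the definition
    end: int    # 1-based last line of the definition (inclusive)


def chunk_source(source: str) -> list[Chunk]:
    """Split source into one chunk per function or class definition.

    Hypothetical helper: an illustration of AST-driven chunking,
    not the chunker described in the paper.
    """
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(Chunk(node.name, node.lineno, node.end_lineno))
    # Order by position; outer definitions come before the ones nested in them.
    return sorted(chunks, key=lambda c: (c.start, -c.end))


def innermost_chunk(chunks: list[Chunk], line: int):
    """Return the smallest chunk that contains the given line, or None."""
    containing = [c for c in chunks if c.start <= line <= c.end]
    return min(containing, key=lambda c: c.end - c.start, default=None)


SAMPLE = (
    "def parse(buf):\n"
    "    n = buf[0]\n"
    "    return buf[1:n]\n"
    "\n"
    "def main():\n"
    "    return parse(b'')\n"
)

chunks = chunk_source(SAMPLE)
crash_chunk = innermost_chunk(chunks, 2)  # pretend a stack trace pointed at line 2
```

Chunking on syntactic boundaries keeps each unit handed to the model self-contained, which is one plausible way to work around the long-context problem the abstract describes.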

Problem

Research questions and friction points this paper is trying to address.

Precisely locating vulnerable code lines in real-world OSS

Overcoming limitations of coarse function-level vulnerability detection

Integrating runtime evidence with code analysis for accurate localization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-round feedback with runtime evidence fusion

Progressive narrowing from modules to vulnerable lines

Agentic Trace Analyzer enabling iterative refinement
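
As a rough illustration of runtime evidence fusion, the sketch below ranks candidate lines by combining stack-trace frames with coverage deltas. Everything here is an assumption made for illustration: the scoring rule, the weights, and the names Candidate and score_candidates are hypothetical and do not come from the paper, which does not specify how ATA weighs its evidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    file: str
    line: int


def score_candidates(stack_frames, coverage_delta,
                     crash_weight=2.0, coverage_weight=1.0):
    """Rank candidate lines by fused runtime evidence (illustrative heuristic).

    stack_frames: list of (file, line) tuples, innermost frame first.
    coverage_delta: set of (file, line) tuples executed only by the crashing input.
    The weights are arbitrary assumptions, not values from the paper.
    """
    scores = {}
    # Frames closer to the crash point contribute more.
    for depth, (file, line) in enumerate(stack_frames):
        cand = Candidate(file, line)
        scores[cand] = scores.get(cand, 0.0) + crash_weight / (depth + 1)
    # Lines newly covered by the crashing input get a flat bonus.
    for file, line in coverage_delta:
        cand = Candidate(file, line)
        scores[cand] = scores.get(cand, 0.0) + coverage_weight
    # Highest score first; ties broken by file name and line for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0].file, kv[0].line))


frames = [("a.c", 10), ("b.c", 20)]
delta = {("a.c", 10), ("a.c", 11)}
ranked = score_candidates(frames, delta)
```

In a multi-round setup, a ranking like this could seed the first pass, with each feedback round re-scoring a smaller set of candidates until a handful of lines remain.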
