🤖 AI Summary
Static analysis of large codebases like the Linux kernel suffers from high false-positive rates due to oversimplified program modeling and approximate constraint solving, severely limiting its practical utility for vulnerability detection. To address this, we propose BugLens, a structured LLM-powered framework that anchors large language models (LLMs) to *verifiable* program paths, data constraints, and security vulnerability patterns. Rather than asking LLMs to perform direct program analysis, where they are inherently unreliable, BugLens combines static analysis toolchains, prompt-engineered LLM reasoning, security pattern matching, and formal constraint verification to assess the security impact and validity of static warnings. Evaluated on the Linux kernel, BugLens raises precision from 0.10 (raw static analysis) to 0.72, substantially reducing false positives while uncovering four previously unknown vulnerabilities.
📝 Abstract
Static analysis is a cornerstone of software vulnerability detection, yet it often struggles with the classic precision-scalability trade-off. In practice, such tools produce high false-positive rates, particularly in large codebases like the Linux kernel. This imprecision can arise from simplified vulnerability modeling and over-approximation of path and data constraints. While large language models (LLMs) show promise in code understanding, their naive application to program analysis yields unreliable results due to inherent reasoning limitations. We introduce BugLens, a post-refinement framework that significantly improves static analysis precision. BugLens guides an LLM through traditional analysis steps: assessing buggy code patterns for security impact and validating the constraints associated with static warnings. Evaluated on real-world Linux kernel bugs, BugLens raises precision from 0.10 (raw) and 0.50 (semi-automated refinement) to 0.72, substantially reducing false positives and revealing four previously unreported vulnerabilities. Our results suggest that a structured LLM-based workflow can meaningfully enhance the effectiveness of static analysis tools.