KNighter: Transforming Static Analysis with LLM-Synthesized Checkers

📅 2025-03-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Manually written static-analysis checkers are slow to produce, narrow in coverage, and typically limited to predefined defect patterns. This paper proposes KNighter, which the authors describe as the first approach to synthesize checkers automatically with an LLM guided by historical patch knowledge. It uses a multi-stage synthesis pipeline that builds prompts from patch information, validates each checker's correctness against the original patch, and automatically refines checkers to reduce false positives. In an evaluation on the Linux kernel, the synthesized checkers achieved high precision and detected bug patterns that existing human-written analyzers had overlooked: they found 70 previously unknown bugs/vulnerabilities, of which 56 were confirmed, 41 fixed, and 11 assigned CVE numbers. The authors present checker synthesis as a scalable, reliable, and traceable route to LLM-based static analysis of real-world systems.

📝 Abstract

Static analysis is a powerful technique for bug detection in critical systems like operating system kernels. However, designing and implementing static analyzers is challenging, time-consuming, and typically limited to predefined bug patterns. While large language models (LLMs) have shown promise for static analysis, directly applying them to scan large codebases remains impractical due to computational constraints and contextual limitations. We present KNighter, the first approach that unlocks practical LLM-based static analysis by automatically synthesizing static analyzers from historical bug patterns. Rather than using LLMs to directly analyze massive codebases, our key insight is leveraging LLMs to generate specialized static analyzers guided by historical patch knowledge. KNighter implements this vision through a multi-stage synthesis pipeline that validates checker correctness against original patches and employs an automated refinement process to iteratively reduce false positives. Our evaluation on the Linux kernel demonstrates that KNighter generates high-precision checkers capable of detecting diverse bug patterns overlooked by existing human-written analyzers. To date, KNighter-synthesized checkers have discovered 70 new bugs/vulnerabilities in the Linux kernel, with 56 confirmed and 41 already fixed. 11 of these findings have been assigned CVE numbers. This work establishes an entirely new paradigm for scalable, reliable, and traceable LLM-based static analysis for real-world systems via checker synthesis.
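The validation step described in the abstract (checking a synthesized checker against the original patch) can be illustrated with a small sketch. This is not KNighter's implementation: the toy checker, the `alloc` call, and the acceptance rule (at least one report on the pre-patch code, none on the post-patch code) are all assumptions made for illustration.

```python
import re
from typing import Callable, List

# A checker maps source text to the 1-based line numbers it reports.
Checker = Callable[[str], List[int]]


def toy_null_check_checker(source: str) -> List[int]:
    """Hypothetical checker: flag `p = alloc(...)` when the next line
    is not an `if (!p)` guard. Real checkers are far more precise."""
    lines = source.splitlines()
    reports = []
    for i, line in enumerate(lines):
        m = re.match(r"\s*(\w+)\s*=\s*alloc\(", line)
        if not m:
            continue
        var = m.group(1)
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        guard = r"if\s*\(\s*!\s*" + re.escape(var) + r"\s*\)"
        if not re.search(guard, nxt):
            reports.append(i + 1)
    return reports


def validates_against_patch(checker: Checker, buggy: str, patched: str) -> bool:
    """Assumed acceptance rule: the checker must report on the pre-patch
    code and stay silent on the post-patch code."""
    return len(checker(buggy)) > 0 and len(checker(patched)) == 0


# Hypothetical pre-patch and post-patch versions of the same function body.
BUGGY = "p = alloc(n);\np->len = n;\n"
PATCHED = "p = alloc(n);\nif (!p)\n    return -1;\np->len = n;\n"

if __name__ == "__main__":
    print(validates_against_patch(toy_null_check_checker, BUGGY, PATCHED))
```

A checker that reports on every input fails this rule because it also fires on the patched code, which is the kind of overly broad candidate such a validation step is meant to reject.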

Problem

Research questions and friction points this paper is trying to address.

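Hand-written static analyzers are time-consuming to build and limited to predefined bug patterns

Scanning large codebases directly with LLMs is impractical because of computational and context limits

Synthesized checkers produce false positives that must be reduced through iterative refinement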

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-synthesized static analyzers from historical bugs

Multi-stage synthesis pipeline for checker validation

Automated refinement to reduce false positives
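The automated refinement idea listed above can be sketched as a loop. This is a minimal illustration under stated assumptions, not the paper's procedure: `propose_fix` stands in for an LLM call, `is_false_positive` stands in for report triage, and re-checking each candidate against the original patch before accepting it is an assumption. All names are hypothetical.

```python
from typing import Callable, List

# A checker returns True when it reports the given code snippet.
Checker = Callable[[str], bool]


def refine_checker(
    checker: Checker,
    corpus: List[str],
    is_false_positive: Callable[[str], bool],
    propose_fix: Callable[[Checker, List[str]], Checker],
    still_valid: Callable[[Checker], bool],
    max_rounds: int = 3,
) -> Checker:
    """Iteratively replace the checker with a refined candidate while
    false positives remain, keeping only candidates that stay valid."""
    for _ in range(max_rounds):
        reports = [s for s in corpus if checker(s)]
        false_positives = [s for s in reports if is_false_positive(s)]
        if not false_positives:
            break
        candidate = propose_fix(checker, false_positives)
        if still_valid(candidate):  # assumed: re-validate on the original patch
            checker = candidate
    return checker


# Hypothetical pre-patch and post-patch snippets plus a tiny scan corpus.
BUGGY = "p = alloc(n); p->len = n;"
PATCHED = "p = alloc(n); if (!p) return -1; p->len = n;"
CORPUS = [BUGGY, PATCHED, "p = alloc(m); if (!p) goto err;", "x = 1;"]


def initial_checker(snippet: str) -> bool:
    return "alloc(" in snippet  # deliberately too broad


def is_false_positive(snippet: str) -> bool:
    return "if (!p)" in snippet  # stand-in for triaging a report


def propose_fix(checker: Checker, false_positives: List[str]) -> Checker:
    # Stand-in for an LLM-proposed refinement: ignore guarded allocations.
    return lambda snippet: checker(snippet) and "if (!p)" not in snippet


def still_valid(checker: Checker) -> bool:
    return checker(BUGGY) and not checker(PATCHED)


refined = refine_checker(
    initial_checker, CORPUS, is_false_positive, propose_fix, still_valid
)

if __name__ == "__main__":
    print([s for s in CORPUS if refined(s)])
```

Gating each candidate on `still_valid` keeps a refinement from suppressing the original bug along with the false positives.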
