Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach

📅 2025-04-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Static vulnerability detection has long suffered from the high cost and low automation of manually constructing vulnerability patterns. This paper proposes MoCQ, a neuro-symbolic closed-loop framework that automates vulnerability pattern generation and detection query refinement end to end. MoCQ uses large language models (LLMs) to extract initial patterns from vulnerability samples and translate them into detection queries, then uses static analysis to validate and refine those queries in a feedback loop. Evaluated on seven vulnerability types across two programming languages, MoCQ achieves precision and recall comparable to expert-crafted queries on a ground-truth dataset. Its queries uncover at least 12 vulnerability patterns that experts missed, and it identifies seven previously unknown vulnerabilities in real-world applications, all responsibly disclosed.

📝 Abstract

Static vulnerability detection remains a challenging problem that demands excessive human effort, e.g., manual curation of good vulnerability patterns. No prior work, whether classic program analysis or Large Language Model (LLM)-based, has fully automated vulnerability pattern generation with reasonable detection accuracy. In this paper, we design and implement MoCQ, a novel holistic neuro-symbolic framework that combines the complementary strengths of LLMs and classical static analysis to enable scalable vulnerability detection. The key insight is that MoCQ uses an LLM to automatically extract vulnerability patterns and translate them into detection queries, then relies on static analysis to refine those queries in a feedback loop and finally execute them to analyze large codebases and mine vulnerabilities. We evaluate MoCQ on seven types of vulnerabilities spanning two programming languages. MoCQ-generated queries uncovered at least 12 patterns that experts had missed. On a ground-truth dataset, MoCQ achieved precision and recall comparable to expert-crafted queries. Moreover, MoCQ identified seven previously unknown vulnerabilities in real-world applications, demonstrating its practical effectiveness. We have responsibly disclosed them to the corresponding developers.
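
For readers unfamiliar with the metrics, the comparison against expert-crafted queries on a ground-truth dataset can be illustrated with a small Python sketch. The function name, the finding identifiers, and the numbers are all hypothetical; none of them come from the paper's evaluation.

```python
from typing import Set, Tuple


def precision_recall(reported: Set[str], ground_truth: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for a set of reported findings.

    Precision is the fraction of reported findings that are real;
    recall is the fraction of real vulnerabilities that were reported.
    """
    true_positives = len(reported & ground_truth)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical findings, identified by "file:line" strings (illustrative only).
reported = {"f1:10", "f2:22", "f3:5"}
ground_truth = {"f1:10", "f2:22", "f4:7", "f5:1"}

precision, recall = precision_recall(reported, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three reported findings are real, and two of the four known vulnerabilities are found.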

Problem

Research questions and friction points this paper is trying to address.

Automating vulnerability pattern generation for static detection

Combining LLMs and static analysis for accurate vulnerability detection

Improving scalability and accuracy in large codebase analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines LLMs and static analysis holistically

Automates vulnerability pattern extraction via LLM

Refines queries through feedback loop for accuracy
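
The propose, validate, refine cycle described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not MoCQ's implementation: `refine_query`, `Feedback`, the toy query syntax, and both stand-in callbacks are hypothetical, and the summary does not specify what feedback the real static analyzer returns.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Feedback:
    """Result of checking a candidate query (hypothetical structure)."""
    ok: bool
    message: str = ""


def refine_query(
    propose: Callable[[str, Optional[str]], str],
    validate: Callable[[str], Feedback],
    sample: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Request a query, validate it, and feed errors back until it passes.

    `propose` stands in for an LLM call; `validate` stands in for a
    static-analysis check. Returns None if no query passes in time.
    """
    hint: Optional[str] = None
    for _ in range(max_rounds):
        query = propose(sample, hint)
        result = validate(query)
        if result.ok:
            return query
        hint = result.message  # the next proposal sees the failure reason
    return None


def toy_propose(sample: str, hint: Optional[str]) -> str:
    # Stand-in for an LLM: the first attempt omits the sink clause.
    if hint is None:
        return "source(user_input)"
    return "source(user_input) -> sink(run_sql)"


def toy_validate(query: str) -> Feedback:
    # Stand-in for static validation: require a sink clause (made-up rule).
    if "sink(" not in query:
        return Feedback(False, "query has no sink clause")
    return Feedback(True)


print(refine_query(toy_propose, toy_validate, "vulnerable sample code"))
```

In this toy run the first proposal fails validation, the failure message is passed back as a hint, and the second proposal is accepted. Bounding the loop with `max_rounds` keeps a query that never validates from consuming unbounded model calls.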
