🤖 AI Summary
Existing large language models (LLMs) discriminate poorly between vulnerable code and syntactically similar benign (patched) code in vulnerability detection (accuracy: 0.06–0.14) and lack causal understanding of where vulnerabilities originate. To address this, the authors propose a knowledge-level retrieval-augmented generation (RAG) framework that shifts vulnerability detection from code representation to causal knowledge reasoning. The method first constructs a multi-dimensional knowledge base from CVE instances, then retrieves the vulnerability knowledge relevant to a given code snippet based on its functional semantics, and finally prompts the LLM in multiple steps to reason about whether the retrieved vulnerability causes and fixing solutions are present in the code. Evaluated on the PairVul benchmark, the approach achieves relative improvements of 12.96% in accuracy and 110% in pairwise accuracy over all baselines. In a user study, manual detection accuracy rises from 0.60 to 0.77 when the generated vulnerability knowledge is provided as an explanation, which suggests better interpretability and practical utility.
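To make the two reported metrics concrete, here is a minimal sketch of how accuracy and pairwise accuracy might be computed over pairs of vulnerable and patched code. The definition used here (a pair counts only if the vulnerable version is flagged and the patched version is not) is an assumption, since this page does not define the metric; the function names and sample predictions are hypothetical.

```python
from typing import List, Tuple

# Each pair holds two predictions: (flagged_vulnerable_version, flagged_patched_version).
# True means the detector labeled that version as vulnerable.
Pair = Tuple[bool, bool]


def accuracy(pairs: List[Pair]) -> float:
    """Fraction of individual code versions classified correctly."""
    if not pairs:
        return 0.0
    correct = sum(int(vuln_pred) + int(not patched_pred) for vuln_pred, patched_pred in pairs)
    return correct / (2 * len(pairs))


def pairwise_accuracy(pairs: List[Pair]) -> float:
    """Fraction of pairs where BOTH versions are classified correctly (assumed definition)."""
    if not pairs:
        return 0.0
    correct = sum(1 for vuln_pred, patched_pred in pairs if vuln_pred and not patched_pred)
    return correct / len(pairs)


# Hypothetical detector that flags everything as vulnerable.
always_flag = [(True, True)] * 4
print(accuracy(always_flag), pairwise_accuracy(always_flag))
```

Under this assumed definition, a detector that cannot tell a vulnerable function from its patch can still reach 0.5 accuracy while scoring 0.0 pairwise accuracy, which is why the pairwise figure is the more demanding one.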
📝 Abstract
Vulnerability detection is essential for software quality assurance. In recent years, deep learning models (especially large language models) have shown promise in vulnerability detection. In this work, we propose Vul-RAG, a novel LLM-based vulnerability detection technique that leverages a knowledge-level retrieval-augmented generation (RAG) framework to detect vulnerabilities in given code in three phases. First, Vul-RAG constructs a vulnerability knowledge base by extracting multi-dimensional knowledge via LLMs from existing CVE instances; second, for a given code snippet, Vul-RAG retrieves the relevant vulnerability knowledge from the constructed knowledge base based on functional semantics; third, Vul-RAG leverages LLMs to check whether the given code snippet is vulnerable by reasoning about the presence of the vulnerability causes and fixing solutions in the retrieved vulnerability knowledge. Our evaluation of Vul-RAG on our constructed benchmark PairVul shows that Vul-RAG substantially outperforms all baselines, with relative improvements of 12.96% in accuracy and 110% in pairwise accuracy. In addition, our user study shows that the vulnerability knowledge generated by Vul-RAG can serve as high-quality explanations, improving manual detection accuracy from 0.60 to 0.77.
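The three phases described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the `KnowledgeItem` schema, the bag-of-words cosine similarity (a stand-in for whatever semantic retrieval the paper uses), the prompt wording, the `ask_llm` callable, and the rule "cause present and fix absent means vulnerable" are all hypothetical. Phase 1 (LLM-based knowledge extraction from CVE instances) is represented only by hand-written entries.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class KnowledgeItem:
    """One knowledge-base entry (hypothetical schema)."""
    entry_id: str
    functional_semantics: str  # what the affected code is meant to do
    cause: str                 # why the vulnerability arises
    fix: str                   # how the patch removes it


def _bag(text: str) -> Counter:
    # Toy text representation: lowercase whitespace tokens with counts.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query_semantics: str, kb: List[KnowledgeItem], k: int = 1) -> List[KnowledgeItem]:
    """Phase 2: rank entries by similarity of functional semantics (toy similarity)."""
    query = _bag(query_semantics)
    ranked = sorted(kb, key=lambda item: _cosine(query, _bag(item.functional_semantics)), reverse=True)
    return ranked[:k]


def detect(code: str, query_semantics: str, kb: List[KnowledgeItem],
           ask_llm: Callable[[str], bool], k: int = 1) -> bool:
    """Phase 3: for each retrieved entry, ask whether its cause and its fix appear in the code."""
    for item in retrieve(query_semantics, kb, k):
        has_cause = ask_llm(f"CAUSE: {item.cause}\nCODE:\n{code}\nIs this cause present?")
        has_fix = ask_llm(f"FIX: {item.fix}\nCODE:\n{code}\nIs this fix already applied?")
        if has_cause and not has_fix:  # assumed decision rule
            return True
    return False


# Phase 1 stand-in: hand-written, hypothetical entries instead of LLM-extracted ones.
kb = [
    KnowledgeItem("EXAMPLE-1", "copy user buffer into kernel memory",
                  "missing length check before copy", "add bounds check on length"),
    KnowledgeItem("EXAMPLE-2", "release device reference on close",
                  "pointer used after free", "clear pointer after free"),
]
```

In practice `ask_llm` would wrap a real model call and parse a yes/no answer; it is left as a plain callable here so the control flow can be exercised with stubs. Checking for the fix as well as the cause is what lets this kind of pipeline separate a vulnerable function from its nearly identical patched version.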