AutoPatch: Multi-Agent Framework for Patching Real-World CVE Vulnerabilities

📅 2025-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

LLMs have a fixed knowledge cutoff, so they can generate code that is vulnerable to CVEs disclosed after that date. This paper proposes AutoPatch, a multi-agent framework that identifies, verifies, and patches such vulnerabilities in LLM-generated code. The method combines semantic matching with taint analysis to find the most relevant CVE, uses enhanced chain-of-thought (CoT) prompting to reason about complex vulnerabilities without being given explicit bug locations, and draws on a structured RAG knowledge base of 525 code snippets derived from 75 high-severity CVEs in real-world systems such as the Linux kernel and Chrome. Experiments report 90.4% accuracy in CVE matching, an 89.5% F1-score for vulnerability verification, and 95.0% accuracy in patching, while being over 50× more cost-efficient than traditional fine-tuning.

📝 Abstract

Large Language Models (LLMs) have emerged as promising tools in software development, enabling automated code generation and analysis. However, their knowledge is limited to a fixed cutoff date, making them prone to generating code vulnerable to newly disclosed CVEs. Frequent fine-tuning with new CVE sets is costly, and existing LLM-based approaches focus on oversimplified CWE examples and require providing explicit bug locations to LLMs, limiting their ability to patch complex real-world vulnerabilities. To address these limitations, we propose AutoPatch, a multi-agent framework designed to patch vulnerable LLM-generated code, particularly code affected by vulnerabilities introduced after the LLMs' knowledge cutoff. AutoPatch integrates Retrieval-Augmented Generation (RAG) with a structured database of recently disclosed vulnerabilities, comprising 525 code snippets derived from 75 high-severity CVEs across real-world systems such as the Linux kernel and Chrome. AutoPatch combines semantic and taint analysis to identify the most relevant CVE and leverages enhanced Chain-of-Thought (CoT) reasoning to construct enriched prompts for verification and patching. Our unified similarity model, which selects the most relevant vulnerabilities, achieves 90.4 percent accuracy in CVE matching. AutoPatch attains 89.5 percent F1-score for vulnerability verification and 95.0 percent accuracy in patching, while being over 50x more cost-efficient than traditional fine-tuning approaches.
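The abstract describes a unified similarity model that combines semantic and taint analysis to pick the most relevant CVE, but this page does not give its formula. The Python sketch below is only an illustration of the general idea under stated assumptions: token-count cosine similarity stands in for a semantic embedding, Jaccard overlap of (source, sink) pairs stands in for taint-flow comparison, and the weight `alpha`, the `CveEntry` record, and the placeholder CVE identifiers are all hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CveEntry:
    """Hypothetical database record: one vulnerable snippet tied to a CVE."""
    cve_id: str
    snippet: str
    taint_flows: frozenset  # (source, sink) name pairs, assumed precomputed


def token_cosine(a: str, b: str) -> float:
    """Cosine similarity over identifier counts (a stand-in for embeddings)."""
    ca = Counter(re.findall(r"[A-Za-z_]\w*", a))
    cb = Counter(re.findall(r"[A-Za-z_]\w*", b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap of two taint-flow sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_cves(code: str, flows: frozenset, db: list, alpha: float = 0.5) -> list:
    """Return (score, cve_id) pairs, best match first.

    alpha is an assumed weight between the semantic and taint scores.
    """
    scored = [
        (alpha * token_cosine(code, e.snippet)
         + (1 - alpha) * jaccard(flows, e.taint_flows), e.cve_id)
        for e in db
    ]
    return sorted(scored, reverse=True)


# Toy data with placeholder identifiers (not real CVEs).
DB = [
    CveEntry("CVE-EXAMPLE-A", "memcpy(dst, user_data, n)",
             frozenset({("user_data", "memcpy")})),
    CveEntry("CVE-EXAMPLE-B", "free(ptr); use(ptr)",
             frozenset({("ptr", "use")})),
]
ranking = rank_cves("memcpy(buf, user_data, len)",
                    frozenset({("user_data", "memcpy")}), DB)
```

In this toy setup, `ranking[0]` is the entry whose snippet and taint flow both resemble the query. A weighted sum is just one plausible way to fuse the two signals; the paper's model may differ.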

Problem

Research questions and friction points this paper is trying to address.

LLMs generate code vulnerable to newly disclosed CVEs because their knowledge stops at a fixed cutoff date

Existing LLM-based methods require explicit bug locations and focus on oversimplified CWE examples

AutoPatch patches real-world vulnerabilities using RAG and multi-agent analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework for patching CVE vulnerabilities

Integrates RAG with structured CVE database

Combines semantic and taint analysis to identify the most relevant CVE
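To make the RAG step concrete, the sketch below shows one way a retrieved CVE record could be folded into enriched prompts for a verification agent and a patching agent. It is a minimal illustration, not the paper's prompts: the `RetrievedCve` fields, the reasoning steps, the verdict format, and the placeholder CVE identifier are all assumptions, and no model is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedCve:
    """Hypothetical shape of one retrieved knowledge-base record."""
    cve_id: str
    root_cause: str
    vulnerable_snippet: str
    fixed_snippet: str


def build_verification_prompt(code: str, cve: RetrievedCve) -> str:
    """Ask for step-by-step reasoning before a verdict (assumed wording)."""
    return "\n".join([
        f"Reference vulnerability: {cve.cve_id}",
        f"Root cause: {cve.root_cause}",
        "Vulnerable reference code:",
        cve.vulnerable_snippet,
        "Code under review:",
        code,
        "Reason step by step:",
        "1. Which inputs in the code under review are attacker-controlled?",
        "2. Do they reach the same kind of operation as in the reference?",
        "3. Is a check equivalent to the reference fix missing?",
        "Finish with one line: VERDICT: VULNERABLE or VERDICT: SAFE",
    ])


def build_patch_prompt(code: str, cve: RetrievedCve) -> str:
    """Show the reference fix and request an analogous patch (assumed wording)."""
    return "\n".join([
        f"The code below was judged vulnerable in the same way as {cve.cve_id}.",
        "Reference fix:",
        cve.fixed_snippet,
        "Code to patch:",
        code,
        "Apply an analogous fix and return only the patched code.",
    ])


# Toy record with a placeholder identifier (not a real CVE).
example = RetrievedCve(
    cve_id="CVE-EXAMPLE-0001",
    root_cause="length from user input is not bounds-checked before copy",
    vulnerable_snippet="memcpy(dst, src, user_len);",
    fixed_snippet="if (user_len <= sizeof(dst)) memcpy(dst, src, user_len);",
)
verify_prompt = build_verification_prompt("memcpy(buf, pkt, pkt_len);", example)
patch_prompt = build_patch_prompt("memcpy(buf, pkt, pkt_len);", example)
```

Keeping verification and patching as separate prompts mirrors the multi-agent split named above: the patching prompt is built only after the verification step has returned a verdict.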

🔎 Similar Papers

APPATCH: Automated Adaptive Prompting Large Language Models for Real-World Software Vulnerability Patching