🤖 AI Summary
To address the scarcity of vulnerability detection data and the heavy reliance of existing AI-based methods on high-quality labeled data, this paper proposes a multi-agent collaborative vulnerability injection framework. The framework integrates a function-level code understanding agent, static analysis tools, and retrieval-augmented generation (RAG) to enable context-aware, category-specific, and realistic vulnerability injection. It further introduces low-rank adaptation (LoRA) to improve fine-tuning efficiency while ensuring safety and controllability during injection. Experiments across three benchmarks comprising 116 C/C++ functions demonstrate a function-level vulnerability injection success rate of 89%–95%, substantially outperforming baseline approaches. The generated dataset exhibits high fidelity and comprehensive coverage across vulnerability categories, establishing a high-quality, scalable data foundation for training AI-driven vulnerability detection models.
📝 Abstract
The increasing complexity of software systems and the sophistication of cyber-attacks have underscored the critical need for effective automated vulnerability detection and repair systems. Traditional methods, such as static program analysis, face significant challenges related to scalability, adaptability, and high false-positive and false-negative rates. AI-driven approaches, particularly those using machine learning and deep learning models, show promise but are heavily reliant on the quality and quantity of training data. This paper introduces a novel framework designed to automatically introduce realistic, category-specific vulnerabilities into secure C/C++ codebases to generate datasets. The proposed approach coordinates multiple AI agents that simulate expert reasoning, together with function-level agents and traditional code analysis tools. It leverages Retrieval-Augmented Generation (RAG) for contextual grounding and employs Low-Rank Adaptation (LoRA) for efficient model fine-tuning. Our experimental study on 116 code samples from three different benchmarks suggests that our approach outperforms other techniques in dataset accuracy, achieving success rates between 89% and 95% when injecting vulnerabilities at the function level.