Adversarial Bug Reports as a Security Risk in Language Model-Based Automated Program Repair

📅 2025-09-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work uncovers a novel adversarial vulnerability in Large Language Model (LLM)-based Automated Program Repair (APR) systems: attackers can craft semantically coherent yet maliciously misleading natural-language bug reports that induce APR systems to generate insecure patches, thereby compromising software supply-chain integrity. To address this, the authors propose the first adversarial attack paradigm specifically targeting APR systems and develop an automated framework for generating adversarial bug reports. Leveraging a human-in-the-loop methodology, they construct high-quality adversarial examples and systematically evaluate both pre-processing filters (e.g., LlamaGuard) and post-hoc detection techniques (e.g., GitHub Copilot’s safety mechanisms, CodeQL). Experiments show that 90% of adversarial reports successfully trigger insecure patches; the strongest pre-filter blocks only 47%, while post-hoc analysis achieves just 58% detection efficacy—revealing a fundamental structural asymmetry between attack and defense and exposing critical weaknesses in current APR security postures.

Technology Category

Application Category

📝 Abstract
Large Language Model (LLM) - based Automated Program Repair (APR) systems are increasingly integrated into modern software development workflows, offering automated patches in response to natural language bug reports. However, this reliance on untrusted user input introduces a novel and underexplored attack surface. In this paper, we investigate the security risks posed by adversarial bug reports -- realistic-looking issue submissions crafted to mislead APR systems into producing insecure or harmful code changes. We develop a comprehensive threat model and conduct an empirical study to evaluate the vulnerability of state-of-the-art APR systems to such attacks. Our demonstration comprises 51 adversarial bug reports generated across a spectrum of strategies, from manual curation to fully automated pipelines. We test these against leading APR model and assess both pre-repair defenses (e.g., LlamaGuard variants, PromptGuard variants, Granite-Guardian, and custom LLM filters) and post-repair detectors (GitHub Copilot, CodeQL). Our findings show that current defenses are insufficient: 90% of crafted bug reports triggered attacker-aligned patches. The best pre-repair filter blocked only 47%, while post-repair analysis-often requiring human oversight-was effective in just 58% of cases. To support scalable security testing, we introduce a prototype framework for automating the generation of adversarial bug reports. Our analysis exposes a structural asymmetry: generating adversarial inputs is inexpensive, while detecting or mitigating them remains costly and error-prone. We conclude with practical recommendations for improving the robustness of APR systems against adversarial misuse and highlight directions for future work on trustworthy automated repair.
Problem

Research questions and friction points this paper is trying to address.

Assessing security risks from adversarial bug reports in LLM-based APR systems
Evaluating vulnerability of APR systems to misleading bug reports
Testing effectiveness of pre-repair and post-repair defense mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed adversarial bug report threat model
Created automated adversarial report generation framework
Tested pre-repair and post-repair defense mechanisms