GUARD:Dual-Agent based Backdoor Defense on Chain-of-Thought in Neural Code Generation

๐Ÿ“… 2025-05-27
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Chain-of-thought (CoT)-enhanced code generation models are vulnerable to backdoor attacks, and existing defenses fail to mitigate them effectively. To address this, we propose GUARD, a dual-agent collaborative defense framework. Its contributions are twofold: (1) GUARD-Judge performs fine-grained anomaly detection by analyzing CoT steps across multiple dimensions and identifying trigger patterns; (2) GUARD-Repair employs retrieval-augmented generation (RAG) combined with adversarial reasoning verification to semantically consistent repair of suspicious reasoning steps. Evaluated on multiple code generation benchmarks, GUARD reduces backdoor attack success rates to <1.2% while incurring only a 0.8% drop in Pass@1โ€”significantly outperforming state-of-the-art defenses. To our knowledge, GUARD is the first approach to jointly guarantee both high security against backdoors and high code generation quality.

Technology Category

Application Category

๐Ÿ“ Abstract
With the widespread application of large language models in code generation, recent studies demonstrate that employing additional Chain-of-Thought generation models can significantly enhance code generation performance by providing explicit reasoning steps. However, as external components, CoT models are particularly vulnerable to backdoor attacks, which existing defense mechanisms often fail to detect effectively. To address this challenge, we propose GUARD, a novel dual-agent defense framework specifically designed to counter CoT backdoor attacks in neural code generation. GUARD integrates two core components: GUARD-Judge, which identifies suspicious CoT steps and potential triggers through comprehensive analysis, and GUARD-Repair, which employs a retrieval-augmented generation approach to regenerate secure CoT steps for identified anomalies. Experimental results show that GUARD effectively mitigates attacks while maintaining generation quality, advancing secure code generation systems.
Problem

Research questions and friction points this paper is trying to address.

Defending against backdoor attacks in Chain-of-Thought code generation
Detecting and repairing vulnerable CoT steps in neural code generation
Ensuring secure code generation without compromising performance quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-agent framework for CoT backdoor defense
GUARD-Judge detects suspicious steps and triggers
GUARD-Repair regenerates secure CoT steps
๐Ÿ”Ž Similar Papers
No similar papers found.
N
Naizhu Jin
State Key Laboratory for Novel Software Technology ,Nanjing University , China
Z
Zhong Li
State Key Laboratory for Novel Software Technology ,Nanjing University , China
T
Tian Zhang
State Key Laboratory for Novel Software Technology ,Nanjing University , China
Qingkai Zeng
Qingkai Zeng
Assistant Professor, Nankai University; University of Notre Dame
data miningnatural language processingknowledge graphlarge language models