Semantically-Equivalent Transformations-Based Backdoor Attacks against Neural Code Models: Characterization and Mitigation

📅 2025-12-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing backdoor defenses for neural code models primarily target injection-based attacks, whose anomalous triggers can be sanitized away, creating a false sense of security. This work introduces a new backdoor paradigm grounded in Semantically-Equivalent Transformations (SET): it generates stealthy triggers via infrequent yet semantics-preserving code rewrites, effectively evading mainstream defenses. The authors propose the first automated framework for SET-trigger generation, integrating fine-grained code semantic analysis with formal modeling of program transformation rules. Evaluated on CodeBERT, CodeT5, and StarCoder across five downstream tasks and six programming languages, the attacks achieve >90% success rates while lowering detection rates by an average of 25.13%; conventional normalization-based mitigation strategies prove only partially effective. The study uncovers a previously overlooked threat dimension, semantics-preserving backdoors, and substantially expands the known attack surface of code intelligence models.

📝 Abstract
Neural code models have been increasingly incorporated into software development processes. However, their susceptibility to backdoor attacks presents a significant security risk. The state-of-the-art understanding focuses on injection-based attacks, which insert anomalous patterns into software code and can therefore be neutralized by standard sanitization techniques. This status quo may lead to a false sense of security regarding backdoor attacks. In this paper, we introduce a new kind of backdoor attack, dubbed the Semantically-Equivalent Transformation (SET)-based backdoor attack, which uses semantics-preserving, low-prevalence code transformations to generate stealthy triggers. We propose a framework to guide the generation of such triggers. Our experiments across five tasks, six programming languages, and models including CodeBERT, CodeT5, and StarCoder show that SET-based attacks achieve high success rates (often >90%) while preserving model utility. The attack is highly stealthy, evading state-of-the-art defenses with detection rates on average 25.13% lower than those of injection-based counterparts. We evaluate normalization-based countermeasures and find that they offer only partial mitigation, confirming the attack's robustness. These results motivate further investigation into scalable defenses tailored to SET-based attacks.
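The abstract's core idea, using a low-prevalence but semantics-preserving rewrite as a trigger, can be illustrated with a small sketch. This example is not taken from the paper; the function names and the specific rewrite are hypothetical, chosen only to show what a SET-style trigger could look like:

```python
def count_matches(items, target):
    # Original, idiomatic form.
    return sum(1 for x in items if x == target)

def count_matches_rewritten(items, target):
    # Semantically equivalent but statistically rare rewrite:
    # an index-based while loop with a double-negated comparison.
    # A model poisoned on such patterns could treat the unusual
    # surface form as a backdoor trigger.
    total, i = 0, 0
    while i < len(items):
        if not (items[i] != target):
            total += 1
        i += 1
    return total

data = [1, 2, 2, 3, 2]
# Both forms compute the same result, so the rewrite is invisible
# to any check that considers only program behavior.
assert count_matches(data, 2) == count_matches_rewritten(data, 2) == 3
```

Because the trigger is ordinary, compilable code rather than an injected anomalous token, sanitizers that look for dead code or unusual identifiers have nothing to flag.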
Problem

Research questions and friction points this paper is trying to address.

Introduces stealthy backdoor attacks using semantics-preserving code transformations
Evaluates attack effectiveness across multiple programming languages and models
Assesses limitations of current defenses and motivates new mitigation strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Semantically-Equivalent Transformation-based stealthy backdoor attacks
Proposes a framework to generate low-prevalence code triggers
Shows attacks evade defenses and require tailored mitigation
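The normalization-based countermeasures the paper evaluates can be sketched in miniature: parsing code to an AST and re-emitting it canonicalizes formatting-level variation, but it cannot undo structural rewrites, which is consistent with the reported finding of only partial mitigation. The sketch below uses Python's standard `ast` module and is an assumption-laden illustration, not the defense evaluated in the paper:

```python
import ast

# Two surface variants of the same function: odd line breaks,
# redundant parentheses, and a comment.
variant_a = "def f(x):\n    return (x\n            + 1)  # note"
variant_b = "def f(x):\n    return x + 1"

def normalize(src: str) -> str:
    # Round-trip through the AST: layout, comments, and redundant
    # parentheses disappear, yielding one canonical rendering.
    return ast.unparse(ast.parse(src))

# Formatting-level differences are normalized away...
assert normalize(variant_a) == normalize(variant_b)

# ...but a structural rewrite (here, an inserted dead loop and a
# temporary variable) survives AST round-tripping, so a SET-style
# trigger of this kind would remain in the normalized code.
variant_c = "def f(x):\n    y = x\n    while False:\n        pass\n    return y + 1"
assert normalize(variant_c) != normalize(variant_b)
```

This gap between surface normalization and semantic equivalence is precisely why defenses tailored to SET-based attacks would need deeper program analysis than token- or format-level cleaning.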
Junyao Ye
Huazhong University of Science and Technology, China
Zhen Li
Huazhong University of Science and Technology, China
Xi Tang
Huazhong University of Science and Technology, China
Shouhuai Xu
Gallogly Chair Professor in Cybersecurity, University of Colorado Colorado Springs
Research interests: Cyber Resilience, Cybersecurity Dynamics, Cybersecurity Metrics, Cybersecurity Analytics, Crypto
Deqing Zou
Huazhong University of Science and Technology, China
Zhongsheng Yuan
Huazhong University of Science and Technology, China