Enhancing and Reporting Robustness Boundary of Neural Code Models for Intelligent Code Understanding

📅 2026-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Neural code models are vulnerable to adversarial attacks, yet existing defenses are often computationally expensive, lack theoretical guarantees, and require white-box access. This work proposes ENBECOME, the first lightweight, training-free defense framework applicable in black-box settings, which smooths decision boundaries by introducing semantics-preserving random perturbations to input code during inference. The method delivers both empirical robustness gains and certified robustness guarantees: on a defect detection task, it reduces the average attack success rate from 42.43% to 9.74% with only a 0.29% drop in accuracy and attains an average certified robustness radius of 1.63. ENBECOME thus establishes the first formal robustness guarantee for neural code models under training-agnostic, black-box conditions.

📝 Abstract
With the development of deep learning, Neural Code Models (NCMs) such as CodeBERT and CodeLlama are widely used for code understanding tasks, including defect detection and code classification. However, recent studies have revealed that NCMs are vulnerable to adversarial examples: inputs with subtle perturbations that induce incorrect predictions while remaining difficult to detect. Existing defenses address this issue via data augmentation to empirically improve robustness, but they are costly, offer no theoretical robustness guarantees, and typically require white-box access to model internals, such as gradients. To address these challenges, we propose ENBECOME, a novel black-box, training-free, and lightweight adversarial defense. ENBECOME is designed both to enhance empirical robustness and to report certified robustness boundaries for NCMs. ENBECOME operates solely during inference, introducing random, semantics-preserving perturbations to input code snippets to smooth the NCM's decision boundaries. This smoothing enables ENBECOME to formally certify a robustness radius within which adversarial examples can never induce misclassification, a property known as certified robustness. We conduct comprehensive experiments across multiple NCM architectures and tasks. Results show that ENBECOME significantly reduces attack success rates while maintaining high accuracy. For example, in defect detection, it reduces the average ASR from 42.43% to 9.74% with only a 0.29% drop in accuracy. Furthermore, ENBECOME achieves an average certified robustness radius of 1.63, meaning that adversarial modifications to no more than 1.63 identifiers are provably ineffective.
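The inference-time smoothing idea described above can be sketched in a few lines: perturb the input code with random, semantics-preserving transformations (here, a toy identifier renamer based on regular expressions; the paper's actual transformations and certification procedure are not reproduced) and take a majority vote over the base model's predictions. The function names (`rename_identifiers`, `smoothed_predict`) and the regex-based renaming are illustrative assumptions, not the authors' implementation.

```python
import random
import re
from collections import Counter

# Keywords that must not be renamed (toy subset; real tooling would use a parser).
KEYWORDS = {"def", "return", "if", "else", "for", "while", "in", "import"}

def rename_identifiers(code, rng):
    """Semantics-preserving perturbation: randomly rename a subset of identifiers.
    A regex-based toy; a production version would parse the code to avoid
    touching strings, comments, or library names."""
    idents = sorted(set(re.findall(r"\b[A-Za-z_]\w*\b", code)) - KEYWORDS)
    mapping = {name: f"var_{rng.randrange(10**6)}"
               for name in idents if rng.random() < 0.5}
    return re.sub(r"\b[A-Za-z_]\w*\b",
                  lambda m: mapping.get(m.group(0), m.group(0)), code)

def smoothed_predict(classify, code, n_samples=100, seed=0):
    """Black-box smoothed prediction: majority vote of `classify` over
    randomly perturbed copies of `code`. Only model outputs are needed,
    no gradients or training."""
    rng = random.Random(seed)
    votes = Counter(classify(rename_identifiers(code, rng))
                    for _ in range(n_samples))
    return votes.most_common(1)[0][0]
```

In the certified-robustness setting, the vote margin between the top two classes is what determines the radius: the larger the margin under random perturbation, the more identifier edits an attacker can make without flipping the smoothed prediction.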
Problem

Research questions and friction points this paper is trying to address.

Neural Code Models
adversarial examples
robustness
code understanding
certified robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

certified robustness
neural code models
adversarial defense
black-box
semantics-preserving perturbation
Tingxu Han
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
Wei Song
School of Computer Science and Engineering, University of New South Wales, New South Wales 2052, Australia
Weisong Sun
Nanyang Technological University
Trustworthy Intelligent SE (Software Engineering)
Hao Wu
School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Chunrong Fang
Software Institute, Nanjing University
Software Testing, Software Engineering, Computer Science
Yuan Xiao
ShanghaiTech University
Computer Security
Xiaofang Zhang
School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Zhenyu Chen
Nanjing University
Intelligent Software Engineering
Yang Liu
Nanyang Technological University
Agent, Software Engineering, Cyber Security, Trustworthy AI, Software Security