State Backdoor: Towards Stealthy Real-world Poisoning Attack on Vision-Language-Action Model in State Space

Published: 2026-01-07
Venue: arXiv.org
Citations: 0
Influential: 0

AI Summary
This work addresses the vulnerability of existing vision-language-action (VLA) models to backdoor attacks in real-world deployment, where conventional methods relying on visible visual triggers suffer from poor stealth and limited robustness. We propose a novel backdoor attack that exploits the initial pose of a robotic arm as a spatial trigger, inducing targeted misbehavior without degrading normal task performance. To efficiently identify minimal yet effective trigger configurations, we introduce a preference-guided genetic algorithm (PGA). Extensive evaluation across five state-of-the-art VLA models and five real-world tasks demonstrates the attack's high effectiveness, achieving over 90% success rate while preserving benign performance. This approach significantly enhances the stealth and practicality of backdoor attacks in embodied AI systems.

Abstract
Vision-Language-Action (VLA) models are widely deployed in safety-critical embodied AI applications such as robotics. However, their complex multimodal interactions also expose new security vulnerabilities. In this paper, we investigate a backdoor threat in VLA models, where malicious inputs cause targeted misbehavior while preserving performance on clean data. Existing backdoor methods predominantly rely on inserting visible triggers into the visual modality, which suffer from poor robustness and low insusceptibility in real-world settings due to environmental variability. To overcome these limitations, we introduce the State Backdoor, a novel and practical backdoor attack that leverages the robot arm's initial state as the trigger. To optimize the trigger for insusceptibility and effectiveness, we design a Preference-guided Genetic Algorithm (PGA) that efficiently searches the state space for minimal yet potent triggers. Extensive experiments on five representative VLA models and five real-world tasks show that our method achieves over 90% attack success rate without affecting benign task performance, revealing an underexplored vulnerability in embodied AI systems.
Problem

Research questions and friction points this paper is trying to address.

backdoor attack
Vision-Language-Action model
state space
embodied AI
poisoning attack
Innovation

Methods, ideas, or system contributions that make the work stand out.

State Backdoor
Vision-Language-Action model
Preference-guided Genetic Algorithm
Embodied AI security
Stealthy poisoning attack
Authors
Ji Guo
Laboratory of Intelligent Collaborative Computing, University of Electronic Science and Technology of China, China
Wenbo Jiang
University of Electronic Science and Technology of China
AI security, Backdoor attack
Yansong Lin
Laboratory of Intelligent Collaborative Computing, University of Electronic Science and Technology of China, China
Yijing Liu
Huazhong University of Science and Technology, National Institutes of Health
Nanomedicine, Microneedle, Drug Delivery, Self-assembly
Ruichen Zhang
Nanyang Technological University
Next-generation Networking, Edge Intelligence, Agentic AI, Reinforcement learning, LLM
Guomin Lu
Laboratory of Intelligent Collaborative Computing, University of Electronic Science and Technology of China, China
Aiguo Chen
University of Electronic Science and Technology of China
Federated learning, Reinforcement learning
Xinshuo Han
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
Hongwei Li
IEEE Fellow, University of Electronic Science and Technology of China
Security, Privacy, AI, Cloud Computing, Smart Grid
Dusit Niyato
College of Computing and Data Science, Nanyang Technological University, Singapore