Phantom Menace: Exploring and Enhancing the Robustness of VLA Models against Physical Sensor Attacks

📅 2025-11-13
🤖 AI Summary
This work presents the first systematic study of the robustness of Vision-Language-Action (VLA) models under physical sensor attacks. Existing VLA models have not been rigorously evaluated against real-world perturbations of cameras (six attack types) and microphones (two attack types). To close this gap, we propose a “Real-Sim-Real” framework that automates attack modeling, simulates the attacks with physics-based models, and validates them in closed loop in real environments. We identify task semantics and model architecture as key determinants of robustness and introduce an out-of-distribution-aware adversarial training method. Extensive experiments across mainstream VLA models show substantial performance degradation under physical attacks; our approach improves attack resilience by 32.7% on average while preserving original task accuracy. This work establishes the first reproducible, scalable analysis methodology and defense framework for securing multimodal embodied AI systems against physical-world sensor threats.

📝 Abstract
Vision-Language-Action (VLA) models revolutionize robotic systems by enabling end-to-end perception-to-action pipelines that integrate multiple sensory modalities, such as visual signals captured by cameras and auditory signals captured by microphones. This multi-modality integration allows VLA models to interpret complex, real-world environments using diverse sensor data streams. Although VLA-based systems rely heavily on sensory input, the security of VLA models against physical-world sensor attacks remains critically underexplored. To address this gap, we present the first systematic study of physical sensor attacks against VLAs, quantifying the influence of sensor attacks and investigating defenses for VLA models. We introduce a novel "Real-Sim-Real" framework that automatically simulates physics-based sensor attack vectors, including six attacks targeting cameras and two targeting microphones, and validates them on real robotic systems. Through large-scale evaluations across various VLA architectures and tasks under varying attack parameters, we demonstrate significant vulnerabilities, with susceptibility patterns that reveal critical dependencies on task types and model designs. We further develop an adversarial-training-based defense that enhances VLA robustness against out-of-distribution physical perturbations caused by sensor attacks while preserving model performance. Our findings expose an urgent need for standardized robustness benchmarks and mitigation strategies to secure VLA deployments in safety-critical environments.
Problem

Research questions and friction points this paper is trying to address.

Investigating VLA model vulnerabilities to physical sensor attacks
Developing defense mechanisms against adversarial sensor manipulations
Establishing robustness benchmarks for safety-critical VLA deployments
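
To make the idea of measuring vulnerability "under varying attack parameters" concrete, here is a minimal sketch of a robustness sweep. Everything in it is an illustrative assumption rather than the paper's method: `overexpose` is a hypothetical brightness perturbation (not necessarily one of the six camera attacks studied), `toy_policy` is a stand-in for a VLA model, and the 8x8 frames are synthetic.

```python
import random

def make_frame(rng, size=8):
    """Synthetic frame (hypothetical): noisy dark background plus one bright target pixel."""
    frame = [[rng.randint(0, 100) for _ in range(size)] for _ in range(size)]
    tx, ty = rng.randrange(size), rng.randrange(size)
    frame[ty][tx] = 200
    return frame, (tx, ty)

def overexpose(frame, strength):
    """Hypothetical camera perturbation: add a brightness offset and clip at 255."""
    offset = int(strength * 255)
    return [[min(255, p + offset) for p in row] for row in frame]

def toy_policy(frame):
    """Stand-in for a VLA model: point at the first brightest pixel (row-major order)."""
    best = max(max(row) for row in frame)
    for y, row in enumerate(frame):
        for x, p in enumerate(row):
            if p == best:
                return (x, y)

def success_rate(strength, trials=200, seed=0):
    """Fraction of trials in which the policy still finds the target under attack."""
    rng = random.Random(seed)  # same seed, so every strength sees the same frames
    hits = 0
    for _ in range(trials):
        frame, target = make_frame(rng)
        hits += toy_policy(overexpose(frame, strength)) == target
    return hits / trials

if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"strength={s:.2f}  success_rate={success_rate(s):.3f}")
```

Reusing the same seed for every strength means each attack setting is scored on identical frames, so any drop in success rate can be attributed to the perturbation rather than to sampling noise. In this toy setup the policy is unaffected until clipping saturates background pixels, which loosely mirrors the idea that susceptibility depends on both the attack parameter and what the task relies on.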
Innovation

Methods, ideas, or system contributions that make the work stand out.

Real-Sim-Real framework simulates physical sensor attacks
Adversarial training defends against out-of-distribution perturbations
Systematically evaluates camera and microphone attack vulnerabilities
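
As a rough illustration of adversarial training against a parameterized sensor perturbation, the sketch below trains a tiny logistic model on each clean sample plus its worst-case perturbed copy. It is a toy under stated assumptions, not the paper's defense: `perturb` (a uniform offset on all features), the candidate `strengths` grid, and the two-feature model are all hypothetical, and the out-of-distribution-aware aspects of the actual method are not modeled.

```python
import math
import random

def perturb(x, strength):
    """Hypothetical sensor attack: add the same offset to every feature."""
    return [v + strength for v in x]

def predict(w, b, x):
    """Logistic model output; z is clamped to avoid overflow in exp."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy for a single sample."""
    p = min(max(predict(w, b, x), 1e-9), 1 - 1e-9)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def worst_case(w, b, x, y, strengths):
    """Inner maximization: the candidate perturbation with the highest loss."""
    return max((perturb(x, s) for s in strengths),
               key=lambda xp: loss(w, b, xp, y))

def train(data, strengths, epochs=50, lr=0.1, seed=0):
    """Outer minimization: SGD on each clean sample and its worst-case copy."""
    rng = random.Random(seed)
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        order = list(data)  # copy so the caller's list is not reordered
        rng.shuffle(order)
        for x, y in order:
            for xi in (x, worst_case(w, b, x, y, strengths)):
                err = predict(w, b, xi) - y  # gradient of the loss w.r.t. z
                w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
                b -= lr * err
    return w, b

if __name__ == "__main__":
    data = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
    w, b = train(data, strengths=[0.0, -0.5, 0.5])
    print("weights:", w, "bias:", b)
```

Training on the clean sample alongside its worst-case copy is one simple way to aim for robustness without giving up clean-task performance; a real VLA defense would replace the offset with physics-based attack models and the logistic model with the policy network.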
Xuancun Lu
USSLAB, Zhejiang University
Jiaxiang Chen
Fudan University
Shilin Xiao
USSLAB, Zhejiang University
Zizhi Jin
USSLAB, Zhejiang University
Zhangrui Chen
ZJU-UIUC Institute, Zhejiang University
Hanwen Yu
ZJU-UIUC Institute, Zhejiang University
Bohan Qian
ZJU-UIUC Institute, Zhejiang University
Ruochen Zhou
City University of Hong Kong
Xiaoyu Ji
Professor, Zhejiang University
IoT security, Sensor Security, AI security
Wenyuan Xu
Professor, IEEE Fellow, Zhejiang University, College of EE
Wireless Network Security, Embedded System Security, Analog Cyber Security, IoT Security