FreezeVLA: Action-Freezing Attacks against Vision-Language-Action Models

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies and formalizes a novel adversarial vulnerability, "action freezing," in which adversarial images drive Vision-Language-Action (VLA) models into persistent response stagnation: the model ignores subsequent language instructions, leaving the robot inactive during critical intervention phases. To study this threat systematically, the authors propose FreezeVLA, the first adversarial attack framework tailored to VLA models, built around a min-max bi-level optimization method that generates adversarial images with high attack success rates and strong cross-instruction transferability. Extensive experiments across three state-of-the-art VLA models and four robotic benchmarks demonstrate an average attack success rate of 76.2%, confirming a substantial risk of operational paralysis in real-world deployments of multimodal embodied AI. The work establishes both theoretical foundations and empirical evidence for the security evaluation and robustness enhancement of VLA models.

📝 Abstract
Vision-Language-Action (VLA) models are driving rapid progress in robotics by enabling agents to interpret multimodal inputs and execute complex, long-horizon tasks. However, their safety and robustness against adversarial attacks remain largely underexplored. In this work, we identify and formalize a critical adversarial vulnerability in which adversarial images can "freeze" VLA models and cause them to ignore subsequent instructions. This threat effectively disconnects the robot's digital mind from its physical actions, potentially inducing inaction during critical interventions. To systematically study this vulnerability, we propose FreezeVLA, a novel attack framework that generates and evaluates action-freezing attacks via min-max bi-level optimization. Experiments on three state-of-the-art VLA models and four robotic benchmarks show that FreezeVLA attains an average attack success rate of 76.2%, significantly outperforming existing methods. Moreover, adversarial images generated by FreezeVLA exhibit strong transferability, with a single image reliably inducing paralysis across diverse language prompts. Our findings expose a critical safety risk in VLA models and highlight the urgent need for robust defense mechanisms.
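The min-max bi-level scheme described in the abstract can be made concrete with a short sketch. The following is a minimal, hypothetical PyTorch implementation: the model interface vla(image, prompt), the idle-action target, the freeze_loss objective, and every hyperparameter are assumptions for illustration, not the paper's actual formulation. The inner step selects the prompt on which freezing currently fails worst; the outer step takes a PGD-style signed-gradient step on an L_inf-bounded perturbation against that prompt.

```python
import torch
import torch.nn.functional as F

def freeze_loss(pred_action, idle_action):
    # Illustrative stand-in: distance between the predicted action and a
    # designated "do nothing" action. The paper's true objective may differ.
    return F.mse_loss(pred_action, idle_action)

def freeze_attack(vla, image, prompts, idle_action,
                  eps=8 / 255, alpha=1 / 255, outer_steps=200):
    """Hypothetical min-max action-freezing attack.

    vla(image, prompt) is assumed to return a differentiable action tensor.
    """
    delta = torch.zeros_like(image, requires_grad=True)

    for _ in range(outer_steps):
        # Inner maximization: find the hardest (least-frozen) prompt, so the
        # single perturbation keeps working across instructions.
        with torch.no_grad():
            losses = torch.stack([
                freeze_loss(vla(image + delta, p), idle_action)
                for p in prompts
            ])
        hardest = prompts[int(losses.argmax())]

        # Outer minimization: push the hardest prompt toward the idle action.
        loss = freeze_loss(vla(image + delta, hardest), idle_action)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)                           # L_inf budget
            delta.copy_((image + delta).clamp(0, 1) - image)  # valid pixels
        delta.grad.zero_()

    return (image + delta).clamp(0, 1).detach()
```

Optimizing against the currently hardest prompt at every step is the design choice that plausibly yields the cross-instruction transferability reported above: the perturbation is never allowed to overfit to a single instruction.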
Problem

Research questions and friction points this paper is trying to address.

Identifies an adversarial vulnerability that freezes VLA models
Examines the safety risk of adversarial images that make models ignore instructions
Proposes an attack framework that can paralyze robotic action
Innovation

Methods, ideas, or system contributions that make the work stand out.

Min-max bi-level optimization generates the adversarial images (see the sketch after the abstract)
Action-freezing attacks cause VLA models to ignore subsequent instructions
A single adversarial image transfers across diverse language prompts (see the evaluation sketch below)
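To make the transferability claim measurable, a single adversarial image can be scored against a set of held-out language prompts. The sketch below is an assumption-laden proxy: it declares the model "frozen" when the predicted action stays within tol of an idle action, which need not match the success criterion behind the paper's reported 76.2% average.

```python
import torch

def attack_success_rate(vla, adv_image, prompts, idle_action, tol=1e-2):
    """Fraction of prompts on which one adversarial image freezes the model.

    The closeness-to-idle criterion is an illustrative assumption.
    """
    frozen = 0
    for p in prompts:
        with torch.no_grad():
            action = vla(adv_image, p)
        frozen += int(torch.linalg.vector_norm(action - idle_action) < tol)
    return frozen / len(prompts)
```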
Authors
Xin Wang (Fudan University)
Jie Li (Shanghai AI Lab)
Zejia Weng (Fudan University): computer vision, video understanding, multi-modal learning
Yixu Wang (Fudan University)
Yifeng Gao (Fudan University)
Tianyu Pang (Sea AI Lab)
Chao Du (Sea AI Lab)
Yan Teng (Shanghai AI Lab)
Yingchun Wang (Shanghai AI Lab)
Zuxuan Wu (Fudan University)
Xingjun Ma (Fudan University): Trustworthy AI, Multimodal AI, Generative AI, Embodied AI
Yu-Gang Jiang (Professor, Fudan University; IEEE & IAPR Fellow): Video Analysis, Embodied AI, Trustworthy AI