VLA-Hijack: A Transferable Patch Attack against Vision-Language-Action Models via Visual Proprioception Hijacking

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two problems: Vision-Language-Action (VLA) models are vulnerable to adversarial patch attacks in safety-critical scenarios, and existing patch attacks transfer poorly across architectures. The authors propose VLA-Hijack, a unified adversarial framework that exploits a vulnerability identified in the paper and shared across VLA models: before planning any motion, a model must visually locate its own robotic arm (visual proprioception). The method combines Attention-Guided Proprioceptive Suppression, which inhibits the real arm's features, with Multimodal Proprioceptive Injection, which establishes the patch as a surrogate "phantom embodiment", and alternates between semantic concept anchoring and visual prototype projection. On OpenVLA, UniVLA, and CronusVLA, it reports superior optimization efficiency in white-box settings and state-of-the-art cross-architecture and cross-domain black-box transferability.

📝 Abstract

While Vision-Language-Action (VLA) models have emerged as powerful generalist policies, their severe vulnerability to adversarial patches significantly hinders their deployment in safety-critical domains. Moreover, existing patch attacks primarily focus on white-box settings, heavily overfitting to the specific action output space of the target model, which results in poor cross-architecture transferability. To overcome this limitation, we propose VLA-Hijack, a unified adversarial framework that breaks the transferability bottleneck by exploiting a fundamental vulnerability identified in this work: before planning any motion, a VLA model must first use visual information to locate its own robotic arm within the environment. Targeting this shared visual self-localization process, our approach concurrently optimizes Attention-Guided Proprioceptive Suppression to inhibit the real robotic arm's features, and Multimodal Proprioceptive Injection to establish the patch as a surrogate "phantom embodiment". By alternating between semantic concept anchoring and visual prototype projection, VLA-Hijack effectively severs the semantic relationship between the agent's true embodiment and its control policy. Extensive experiments across diverse architectures (OpenVLA, UniVLA, and CronusVLA) demonstrate that VLA-Hijack achieves superior optimization efficiency in white-box settings and sets a new SOTA for cross-architecture and cross-domain black-box transferability.
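
The abstract does not give the loss functions, so the following is only a toy sketch of how a two-term objective of this kind could be wired together, not the authors' implementation. Every name in it (`suppression_loss`, `injection_loss`, `pick_target`, `hijack_loss`, the `weight` parameter) is hypothetical, and the choices of "attention mass on arm tokens" for suppression, "one minus cosine similarity" for injection, and even/odd step alternation are assumptions made for illustration.

```python
import math


def suppression_loss(attn, arm_mask):
    """Assumed form: share of attention mass that lands on real-arm tokens."""
    total = sum(attn)
    on_arm = sum(a for a, is_arm in zip(attn, arm_mask) if is_arm)
    return on_arm / total if total > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def injection_loss(patch_feat, target):
    """Assumed form: pull the patch feature toward a target embedding."""
    return 1.0 - cosine(patch_feat, target)


def pick_target(step, text_anchor, visual_prototype):
    """Assumed schedule: even steps use the text anchor, odd steps the visual prototype."""
    return text_anchor if step % 2 == 0 else visual_prototype


def hijack_loss(step, attn, arm_mask, patch_feat,
                text_anchor, visual_prototype, weight=1.0):
    """Sum of the two hypothetical terms; lower is better for the attacker."""
    target = pick_target(step, text_anchor, visual_prototype)
    return (suppression_loss(attn, arm_mask)
            + weight * injection_loss(patch_feat, target))


if __name__ == "__main__":
    attn = [0.1, 0.6, 0.3]            # toy attention over three visual tokens
    arm_mask = [False, True, False]   # token 1 is assumed to cover the real arm
    patch_feat = [1.0, 0.0]           # toy feature extracted from the patch
    text_anchor = [1.0, 0.0]          # toy embedding of a "robot arm" concept
    visual_prototype = [0.0, 1.0]     # toy prototype of arm image features
    for step in range(2):
        print(step, hijack_loss(step, attn, arm_mask, patch_feat,
                                text_anchor, visual_prototype))
```

In a real attack the patch pixels would be updated by gradient descent through a surrogate model's vision encoder; the sketch only shows how the two terms and the alternating target fit into a single scalar objective.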

Problem

Research questions and friction points this paper is trying to address.

Vision-Language-Action models

adversarial patch attack

transferability

visual proprioception

cross-architecture

Innovation

Methods, ideas, or system contributions that make the work stand out.

adversarial patch attack

vision-language-action models

visual proprioception hijacking