ReasonBreak: Probing Vulnerabilities in Reasoning-Enabled Vision-Language-Action Models for Autonomous Driving

📅 2026-05-27
🤖 AI Summary
This study systematically evaluates the robustness of reasoning-capable vision-language-action (VLA) models under realistic textual perturbations, showing that corrupted language inputs can cause reasoning errors and trajectory deviations in end-to-end autonomous driving. It presents the first systematic black-box safety assessment of industry-developed reasoning-based VLA models, using NVIDIA's Alpamayo models as representatives, and introduces an evaluation framework that captures both semantic and structural aspects of reasoning alongside safety-centric measures. The work also introduces a benchmark for evaluating attacks and defenses on the interaction between reasoning and trajectory generation. In closed-loop simulation, the attacks reach up to 89% attack success rate (ASR) on reasoning and up to 72% on trajectory manipulation, increasing collision rates and degrading safety metrics.
📝 Abstract
Vision-Language-Action (VLA) models with integrated reasoning have been proposed for end-to-end autonomous driving, on the assumption of a tight coupling between reasoning and trajectory generation. However, the robustness of such systems under realistic input perturbations remains largely unexplored. We show that these models are highly vulnerable to such perturbations: attacks achieve up to 89% attack success rate (ASR) on reasoning and up to 72% on trajectory manipulation in closed-loop simulation, leading to increased collision rates and degraded safety metrics. Using NVIDIA's recent Alpamayo models as representative industry-developed VLAs, we conduct the first systematic black-box study of reasoning-enabled VLA models under realistic textual input corruptions, evaluating their impact on reasoning and driving behavior. We introduce a reasoning-aware evaluation framework capturing both semantic and structural aspects of reasoning, along with safety-centric measures. We also introduce a benchmark for evaluating attacks and defenses on reasoning-trajectory interactions in autonomous driving. Our results highlight the need for rigorous evaluation and improved defenses to ensure the safety of reasoning-enabled VLA systems in autonomous driving.
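The abstract reports attack success rates (ASR) for reasoning and for trajectory manipulation. As a rough illustration only, ASR is commonly computed as the fraction of attack trials that meet a success criterion. The sketch below assumes hypothetical per-trial records, a hypothetical `reasoning_changed` flag, and a hypothetical 1.0 m deviation threshold; none of these names or values come from the paper, whose actual success criteria are not given here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class AttackTrial:
    # Hypothetical fields, not taken from the paper.
    reasoning_changed: bool        # did the perturbed input alter the reasoning output?
    trajectory_deviation_m: float  # deviation from the unperturbed trajectory, in meters


def attack_success_rate(successes: Iterable[bool]) -> float:
    """Fraction of trials flagged as successful attacks."""
    flags = list(successes)
    if not flags:
        raise ValueError("at least one trial is required")
    return sum(flags) / len(flags)


# Illustrative, made-up trials; these are not results from the paper.
trials = [
    AttackTrial(reasoning_changed=True, trajectory_deviation_m=0.2),
    AttackTrial(reasoning_changed=True, trajectory_deviation_m=1.8),
    AttackTrial(reasoning_changed=False, trajectory_deviation_m=0.1),
    AttackTrial(reasoning_changed=True, trajectory_deviation_m=2.5),
]

# Assumed threshold above which a trajectory counts as manipulated.
DEVIATION_THRESHOLD_M = 1.0

reasoning_asr = attack_success_rate(t.reasoning_changed for t in trials)
trajectory_asr = attack_success_rate(
    t.trajectory_deviation_m > DEVIATION_THRESHOLD_M for t in trials
)

print(f"reasoning ASR:  {reasoning_asr:.2f}")
print(f"trajectory ASR: {trajectory_asr:.2f}")
```

Keeping the two rates separate mirrors the abstract's distinction between reasoning attacks and trajectory manipulation: a perturbation can change the reasoning text without moving the planned trajectory past the chosen threshold.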
Problem

Research questions and friction points this paper is trying to address.

Vision-Language-Action models
reasoning robustness
autonomous driving
input perturbations
safety evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language-Action models
reasoning robustness
input perturbation
autonomous driving safety
black-box evaluation