Adversarial Flow Matching for Imperceptible Attacks on End-to-End Autonomous Driving

📅 2026-04-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

End-to-end autonomous driving systems increasingly rely on Transformer-based modules, which may expose them to a shared vulnerability: visually imperceptible perturbations that target the Transformer. Existing adversarial attacks either require full model transparency (white-box) or suffer from prohibitive query latency or limited transferability (black-box). This work proposes Adversarial Flow Matching (AFM), a novel gray-box attack framework that brings flow matching to adversarial attacks on autonomous driving. Using a neural average velocity field, AFM generates adversarial examples in a single step, and it keeps them visually imperceptible by jointly perturbing the generative latent space and the velocity field. Requiring only the knowledge that the target system contains a Transformer-based module, AFM produces transferable attacks with state-of-the-art visual imperceptibility and substantially degrades the performance of both Vision-Language-Action (VLA) and modular autonomous driving agents across various scenarios.

📝 Abstract

Autonomous driving (AD) is evolving towards end-to-end (E2E) frameworks through two primary paradigms: monolithic models exemplified by Vision-Language-Action (VLA), and specialized modular architectures. Despite their divergent designs, both paradigms increasingly rely on Transformer backbones for complex reasoning, potentially causing a shared vulnerability: visually imperceptible perturbations can manipulate E2E AD models into hazardous maneuvers by targeting the Transformer module. Most existing adversarial attack approaches against AD systems operate under white-box or black-box settings; yet, they typically necessitate full model transparency, or suffer from either prohibitive query latency or limited attack transferability. In this paper, we propose Adversarial Flow Matching (AFM), a novel gray-box attack framework that exploits Transformer structural vulnerabilities in E2E AD models. AFM enables efficient one-step generation of adversarial examples via a neural average velocity field. Additionally, the proposed technique yields effective and visually imperceptible attacks by synergistically perturbing the generative latent space and the neural average velocity field. Extensive experiments demonstrate that AFM achieves a superior trade-off between attack effectiveness and imperceptibility: it substantially degrades the performance of both VLA and modular AD agents across various scenarios compared to baselines, while maintaining state-of-the-art visual imperceptibility. Furthermore, adversarial examples generated by AFM exhibit robust cross-model transferability, indicating that AFM closely approximates a black-box attack setting while requiring only the prior knowledge that the target AD model incorporates a Transformer-based module.
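
The abstract does not spell out how an average velocity field permits one-step generation, so the following is only a toy sketch of the general idea, not AFM itself. It assumes the common definition in which the average velocity u over an interval [r, t] satisfies z_r = z_t - (t - r) * u(z_t, r, t), and it swaps the neural network for a closed-form field (the constant `A` and all function names are hypothetical). The point it illustrates: one step with the average velocity lands exactly on the ODE solution, whereas one Euler step with the instantaneous velocity does not.

```python
import numpy as np

A = 1.5  # hypothetical constant defining the toy velocity field dz/dt = A * z


def instantaneous_velocity(z, t):
    """Toy instantaneous velocity v(z, t) = A * z (independent of t here)."""
    return A * z


def average_velocity(z_t, r, t):
    """Closed-form average velocity of the toy field over [r, t].

    u(z_t, r, t) = (z_t - z_r) / (t - r), with z_r = z_t * exp(-A * (t - r))
    the exact ODE solution. A trained network would approximate u in practice.
    """
    return z_t * (1.0 - np.exp(-A * (t - r))) / (t - r)


def one_step_sample(z_t, r=0.0, t=1.0):
    """Jump from time t to time r in a single step using the average velocity."""
    return z_t - (t - r) * average_velocity(z_t, r, t)


def euler_sample(z_t, n_steps, r=0.0, t=1.0):
    """Integrate backwards from t to r with explicit Euler steps."""
    dt = (t - r) / n_steps
    z, s = z_t.copy(), t
    for _ in range(n_steps):
        z = z - dt * instantaneous_velocity(z, s)
        s -= dt
    return z


rng = np.random.default_rng(0)
z1 = rng.standard_normal(4)   # starting point at t = 1
exact = z1 * np.exp(-A)       # exact ODE solution at t = 0

one_step = one_step_sample(z1)
euler_1 = euler_sample(z1, n_steps=1)
euler_100 = euler_sample(z1, n_steps=100)

print("one-step error  :", np.max(np.abs(one_step - exact)))
print("Euler-1 error   :", np.max(np.abs(euler_1 - exact)))
print("Euler-100 error :", np.max(np.abs(euler_100 - exact)))
```

In this toy setting the single average-velocity step matches the exact solution up to floating-point error, while Euler integration needs many steps to get close. How AFM parameterizes and perturbs its velocity field is not described in the abstract and is not modeled here.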

Problem

Research questions and friction points this paper is trying to address.

adversarial attacks

end-to-end autonomous driving

Transformer vulnerability

visual imperceptibility

gray-box setting

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial Flow Matching

Transformer vulnerability

gray-box attack