NutVLM: A Self-Adaptive Defense Framework against Full-Dimension Attacks for Vision Language Models in Autonomous Driving

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of vision-language models (VLMs) in autonomous driving to diverse adversarial attacks, from localized physical patches to globally imperceptible perturbations, threats that existing defenses struggle to counter without degrading clean-sample performance. To this end, we propose NutVLM, an adaptive defense framework spanning the entire perception-to-decision pipeline. Our approach introduces a novel three-class detection mechanism coupled with lightweight purification strategies: grayscale masking mitigates local patch attacks, while gradient-based optimization combined with discrete projection generates corrective driving prompts for global perturbations. Furthermore, we incorporate Expert-guided Adversarial Prompt Tuning (EAPT) to refine prompts without full-model fine-tuning. Evaluated on the Dolphins benchmark, NutVLM achieves a 4.89% overall improvement, significantly enhancing accuracy, language quality, and GPT-based scores, demonstrating both effectiveness and scalability.
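The grayscale-masking step described above is conceptually simple: once the detector flags a localized patch, the flagged region is overwritten with a neutral gray so the attack pixels can no longer steer the VLM. A minimal illustrative sketch is below; the function name, mask format, and gray value are assumptions, not the paper's actual implementation.

```python
import numpy as np

def grayscale_mask(image: np.ndarray, patch_mask: np.ndarray,
                   gray_value: float = 0.5) -> np.ndarray:
    """Purify a localized adversarial patch by overwriting the masked
    region with a neutral gray, leaving clean pixels untouched.

    image:      H x W x 3 array with values in [0, 1]
    patch_mask: H x W boolean array, True where a patch was detected
    """
    purified = image.copy()
    purified[patch_mask] = gray_value  # neutralize the attacked region only
    return purified

# Toy example: a 4x4 "image" with a detected 2x2 patch in one corner
img = np.zeros((4, 4, 3))
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
clean = grayscale_mask(img, mask)
```

Because only the masked pixels change, clean-sample performance is unaffected whenever the detector reports no patch, which is the balance the framework aims for.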

📝 Abstract
Vision Language Models (VLMs) have advanced perception in autonomous driving (AD), but they remain vulnerable to adversarial threats ranging from localized physical patches to imperceptible global perturbations. Existing defense methods for VLMs remain limited and often fail to reconcile robustness with clean-sample performance. To bridge these gaps, we propose NutVLM, a comprehensive self-adaptive defense framework designed to secure the entire perception-decision lifecycle. Specifically, we first employ NutNet++ as a sentinel: a unified detection-purification mechanism that separates benign samples, local patches, and global perturbations via three-way classification. Localized threats are then purified via efficient grayscale masking, while global perturbations trigger Expert-guided Adversarial Prompt Tuning (EAPT). Instead of the costly parameter updates of full-model fine-tuning, EAPT generates "corrective driving prompts" via gradient-based latent optimization and discrete projection; these prompts refocus the VLM's attention without exhaustive retraining. Evaluated on the Dolphins benchmark, NutVLM yields a 4.89% improvement in overall metrics (e.g., Accuracy, Language Score, and GPT Score). These results validate NutVLM as a scalable security solution for intelligent transportation. Our code is available at https://github.com/PXX/NutVLM.
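The EAPT recipe in the abstract has two stages: optimize a continuous prompt latent by gradient descent, then project each slot onto the nearest discrete token embedding so the result is a usable prompt. The sketch below illustrates that two-stage pattern on a toy surrogate loss; the vocabulary, loss, and learning rate are all assumptions for illustration, not the paper's objective or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy token-embedding table (vocab of 8 "tokens", dim 4), standing in
# for the VLM's real embedding matrix that EAPT would operate over.
vocab = rng.normal(size=(8, 4))
target = vocab[3] + vocab[5]          # toy "expert-guided" direction to match

def loss(z: np.ndarray) -> float:
    """Toy surrogate loss: squared distance of the summed prompt to target."""
    return float(np.sum((z.sum(axis=0) - target) ** 2))

def grad(z: np.ndarray) -> np.ndarray:
    """Analytic gradient of the toy loss w.r.t. each prompt slot."""
    g = 2.0 * (z.sum(axis=0) - target)
    return np.tile(g, (z.shape[0], 1))

# Stage 1: continuous latent optimization over two soft prompt slots
z = rng.normal(size=(2, 4))
for _ in range(200):
    z -= 0.05 * grad(z)

# Stage 2: discrete projection -- snap each optimized slot to its
# nearest vocabulary embedding, yielding concrete token ids
tokens = [int(np.argmin(np.linalg.norm(vocab - slot, axis=1))) for slot in z]
```

The key design point mirrored here is that only the small prompt latent is optimized while the (simulated) model weights stay frozen, which is what makes the approach cheaper than full-model fine-tuning.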
Problem

Research questions and friction points this paper is trying to address.

Vision Language Models
Adversarial Attacks
Autonomous Driving
Robustness
Defense Framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

self-adaptive defense
vision language models
adversarial robustness
prompt tuning
autonomous driving