EO-VLM: VLM-Guided Energy Overload Attacks on Vision Models

📅 2025-04-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work uncovers a critical energy-security vulnerability in vision models: adversaries can exploit vision-language models (VLMs) to mount resource-exhaustion attacks, generating human-imperceptible adversarial images that sharply increase the GPU power consumption of target models and thereby threaten the availability of safety-critical systems such as autonomous driving and video surveillance. The authors propose the first VLM-guided energy-overload attack paradigm that is agnostic to the target model's type and architecture: it requires only prompt engineering, with no white-box access, gradient information, or prior knowledge of the victim model. Crucially, this is the first demonstration that VLMs can be maliciously repurposed as attack tools at the physical-resource layer. Extensive evaluation across diverse mainstream vision models shows up to a 50% increase in GPU energy consumption, exposing a previously unaddressed dimension of energy-security risk in vision AI systems.
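The headline result, up to a 50% rise in GPU energy draw, presupposes some way of metering the victim GPU during inference. The summary does not say how this was instrumented; the sketch below shows one plausible approach using NVML power sampling. The helper name `measure_energy_j`, the `pynvml` bindings, and the 50 ms sampling interval are assumptions, not the paper's setup.

```python
import threading
import time

import pynvml  # NVML bindings: pip install nvidia-ml-py


def measure_energy_j(run_inference, gpu_index=0, interval_s=0.05):
    """Approximate GPU energy (joules) consumed while run_inference() executes.

    Samples instantaneous power at a fixed interval and integrates with a
    Riemann sum; loop short workloads to reduce sampling error.
    """
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    samples = []
    done = threading.Event()

    def sampler():
        while not done.is_set():
            # nvmlDeviceGetPowerUsage reports milliwatts; convert to watts.
            samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)
            time.sleep(interval_s)

    t = threading.Thread(target=sampler)
    t.start()
    run_inference()  # black-box victim model call goes here
    done.set()
    t.join()
    pynvml.nvmlShutdown()
    return sum(samples) * interval_s  # watts x seconds = joules
```

Running this once on a clean image and once on its adversarial counterpart, then comparing the two readings, yields the relative energy increase the paper reports.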

📝 Abstract
Vision models are increasingly deployed in critical applications such as autonomous driving and CCTV monitoring, yet they remain susceptible to resource-consuming attacks. In this paper, we introduce a novel energy-overloading attack that leverages vision language model (VLM) prompts to generate adversarial images targeting vision models. These images, though imperceptible to the human eye, significantly increase GPU energy consumption across various vision models, threatening the availability of these systems. Our framework, EO-VLM (Energy Overload via VLM), is model-agnostic, meaning it is not limited by the architecture or type of the target vision model. By exploiting the lack of safety filters in VLMs like DALL-E 3, we create adversarial noise images without requiring prior knowledge or internal structure of the target vision models. Our experiments demonstrate up to a 50% increase in energy consumption, revealing a critical vulnerability in current vision models.
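Read together with the summary above, the abstract implies a simple black-box loop: ask the image generator for candidate noise images, blend each into the clean input at low opacity so the change stays below human perception, and keep the blend that drives the victim's measured energy highest. The sketch below is one way such a loop could look; `overlay_noise`, the blend weight `alpha`, and the candidate-selection strategy are illustrative assumptions, not the paper's published procedure.

```python
import numpy as np


def overlay_noise(clean: np.ndarray, noise: np.ndarray, alpha: float = 0.03) -> np.ndarray:
    """Blend a generated noise image into a clean image at low opacity.

    A small alpha keeps the perturbation visually imperceptible while
    still shifting the pixel statistics the victim model has to process.
    """
    assert clean.shape == noise.shape, "noise must match the clean image shape"
    mixed = (1.0 - alpha) * clean.astype(np.float32) + alpha * noise.astype(np.float32)
    return np.clip(mixed, 0, 255).astype(np.uint8)


def best_attack_image(clean, candidate_noises, energy_of):
    """Select the overlay that maximizes the victim's measured GPU energy.

    energy_of(image) runs the black-box victim on the image and returns
    joules (e.g., via measure_energy_j above); no gradients or internals
    of the victim are touched, matching the model-agnostic claim.
    """
    scored = [(energy_of(overlay_noise(clean, n)), n) for n in candidate_noises]
    _, best = max(scored, key=lambda pair: pair[0])
    return overlay_noise(clean, best)
```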
Problem

Research questions and friction points this paper is trying to address.

Energy-overloading attacks on vision models via VLM prompts
Adversarial images increase GPU energy consumption imperceptibly
Exploiting VLM safety gaps to threaten vision model availability
Innovation

Methods, ideas, or system contributions that make the work stand out.

VLM-guided adversarial images for energy attacks
Model-agnostic framework targeting vision models
Exploiting gaps in VLM safety filters via prompt engineering (see the sketch after this list)
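Since the abstract names DALL-E 3 as the generator whose missing safety filters are exploited, a candidate noise image could be requested through the standard OpenAI Python SDK as below. The prompt text is purely illustrative; the paper's actual prompts are not reproduced here.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical prompt: phrased as a benign abstract-art request so the
# generator's filters pass it, while the output is dense high-frequency
# noise of the kind the attack overlays onto clean inputs.
NOISE_PROMPT = (
    "Abstract art: a full-frame field of extremely fine, dense, "
    "high-contrast static, like analog television noise, no objects"
)

response = client.images.generate(
    model="dall-e-3",
    prompt=NOISE_PROMPT,
    size="1024x1024",
    n=1,
)
print(response.data[0].url)  # URL of one candidate noise image
```

Each downloaded candidate can then be fed into the selection loop sketched after the abstract.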