🤖 AI Summary
This work addresses the problem of achieving fine-grained instruction following in vision-language models (VLMs) without modifying their pretrained weights. We propose a lightweight activation-guidance module that dynamically modulates semantic interactions between visual and linguistic modalities via dimension-wise activation modulation and cross-layer adaptive guidance—requiring no predefined intervention layers or static control vectors, and introducing only 0.14% additional parameters. Our method learns latent-space embeddings for target and counterfactual behaviors using a novel multimodal dataset, VNIA, specifically curated for training and evaluation. Experiments demonstrate that our approach significantly outperforms existing intervention methods on instruction-following and hallucination suppression tasks, while preserving performance on non-target tasks. This validates activation engineering as an effective paradigm for controllable multimodal reasoning.
📝 Abstract
This work introduces SteerVLM, a lightweight steering module designed to guide Vision-Language Models (VLMs) toward outputs that better adhere to desired instructions. Our approach learns from the latent embeddings of paired prompts encoding target and converse behaviors to dynamically adjust the activations connecting the language modality with image context. This allows fine-grained, inference-time control over complex output semantics without modifying model weights, while preserving performance on off-target tasks. The steering module requires learned parameters equal to only 0.14% of the original VLM's size, and it gains model control through dimension-wise activation modulation and adaptive steering across layers, without requiring pre-extracted static vectors or manual tuning of intervention points. Furthermore, we introduce VNIA (Visual Narrative Intent Alignment), a multimodal dataset created specifically to facilitate the development and evaluation of VLM steering techniques. Our method outperforms existing intervention techniques on steering and hallucination-mitigation benchmarks for VLMs and offers a robust solution for multimodal model control through activation engineering.
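The core mechanism described above can be sketched in a few lines: derive a steering direction from the latent embeddings of a target/converse prompt pair, then add it to a layer's hidden activation through a learned per-dimension gate and a per-layer adaptive scale. This is a minimal illustrative sketch, not the paper's actual module; the names (`steering_vector`, `steer`, `gate`, `layer_scale`) and the sigmoid gate initialization are assumptions for exposition.

```python
import numpy as np

def steering_vector(target_emb, converse_emb):
    """Steering direction: difference of latent embeddings for the
    target behavior and its converse (illustrative construction)."""
    return target_emb - converse_emb

def steer(h, v, gate, layer_scale):
    """Dimension-wise activation modulation at one layer.

    h           : (d,) hidden activation at this layer
    v           : (d,) steering direction
    gate        : (d,) learned per-dimension modulation in [0, 1]
    layer_scale : scalar adaptive steering strength for this layer
    """
    return h + layer_scale * gate * v

# Toy example with random "embeddings" (hypothetical data, d = 8).
rng = np.random.default_rng(0)
d = 8
target_emb = rng.normal(size=d)
converse_emb = rng.normal(size=d)
h = rng.normal(size=d)

# A sigmoid squashes a learned parameter into a [0, 1] gate.
gate = 1.0 / (1.0 + np.exp(-rng.normal(size=d)))

v = steering_vector(target_emb, converse_emb)
h_steered = steer(h, v, gate, layer_scale=0.5)
```

Because both the gate and the per-layer scale are learned rather than fixed, the same formulation can suppress steering entirely on layers or dimensions where intervention hurts off-target performance (gate or scale near zero leaves `h` unchanged).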