From Behavioral Performance to Internal Competence: Interpreting Vision-Language Models with VLM-Lens

📅 2025-10-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing evaluations of vision-language models (VLMs) rely predominantly on behavioral metrics and lack systematic, representation-level interpretability analysis. Method: We introduce VLM-Lens, the first open-source interpretability toolkit to support this range of VLMs, featuring a unified YAML-based configuration interface that extracts intermediate hidden representations from arbitrary layers while abstracting away architectural heterogeneity. Its modular design accommodates 16 mainstream VLMs and more than 30 variants, integrates diverse interpretability methods, and extends to new models without changes to the core logic. Contribution/Results: Empirical evaluation demonstrates VLM-Lens's effectiveness in cross-model and cross-layer conceptual representation analysis, uncovering hierarchical evolution patterns and systematic differences in concept activation profiles. The toolkit establishes a reproducible, extensible foundation for probing the internal mechanisms of VLMs.

📝 Abstract
We introduce VLM-Lens, a toolkit designed to enable systematic benchmarking, analysis, and interpretation of vision-language models (VLMs) by supporting the extraction of intermediate outputs from any layer during the forward pass of open-source VLMs. VLM-Lens provides a unified, YAML-configurable interface that abstracts away model-specific complexities and supports user-friendly operation across diverse VLMs. It currently supports 16 state-of-the-art base VLMs and more than 30 variants of them, and is extensible to new models without changes to the core logic. The toolkit integrates easily with various interpretability and analysis methods. We demonstrate its usage with two simple analytical experiments, revealing systematic differences in the hidden representations of VLMs across layers and target concepts. VLM-Lens is released as an open-source project to accelerate community efforts in understanding and improving VLMs.
Problem

Research questions and friction points this paper is trying to address.

Systematically benchmark and interpret vision-language models' internal workings
Extract intermediate outputs from any layer during VLM forward pass
Provide unified configurable interface for diverse VLM analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts intermediate outputs from VLM layers
Provides unified YAML-configurable interface for VLMs
Supports extensible integration with interpretability methods
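The core mechanism the bullets describe, extracting intermediate outputs from named layers driven by a declarative config, can be sketched with standard PyTorch forward hooks. This is an illustrative sketch only: the actual VLM-Lens configuration schema and API are not shown on this page, so the config keys, `ToyVLM` model, and `extract_hidden_states` helper below are hypothetical stand-ins for the general technique.

```python
# Illustrative sketch only: the config keys, ToyVLM, and extract_hidden_states
# are hypothetical; VLM-Lens's real schema and API may differ. The mechanism
# shown (a forward hook on a named submodule) is standard PyTorch.
import torch
import torch.nn as nn

# Hypothetical YAML-style config, represented as a dict: a model identifier
# plus dotted module paths for the layers whose outputs should be captured.
config = {
    "model": "toy-vlm",
    "layers": ["encoder.0", "encoder.1"],
}

class ToyVLM(nn.Module):
    """Stand-in for a VLM: a small stack of linear layers."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 4))

    def forward(self, x):
        return self.encoder(x)

def extract_hidden_states(model, x, layer_names):
    """Run one forward pass and capture outputs of the named submodules."""
    captured = {}
    handles = []
    modules = dict(model.named_modules())
    for name in layer_names:
        # Bind `name` per-hook so each closure records under its own key.
        def hook(_mod, _inp, out, name=name):
            captured[name] = out.detach()
        handles.append(modules[name].register_forward_hook(hook))
    with torch.no_grad():
        model(x)
    for h in handles:  # always unregister hooks after use
        h.remove()
    return captured

model = ToyVLM()
hidden = extract_hidden_states(model, torch.randn(2, 8), config["layers"])
print({k: tuple(v.shape) for k, v in hidden.items()})
```

Because the extraction is keyed by module paths from `named_modules()`, swapping in a different architecture only requires changing the layer names in the config, which mirrors the zero-core-change extensibility the toolkit claims.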
Authors

Hala Sheta (University of Waterloo)
Eric Huang (Associate Director of Data Science, Chewy)
Shuyu Wu (University of Michigan)
Ilia Alenabi (University of Waterloo)
Jiajun Hong (Stony Brook University)
Ryker Lin (University of Waterloo)
Ruoxi Ning (University of Waterloo)
Daniel Wei (University of Waterloo)
Jialin Yang (University of Calgary)
Jiawei Zhou (Stony Brook University)
Ziqiao Ma (University of Michigan)
Freda Shi (Assistant Professor of Computer Science, University of Waterloo)