Deciphering Functions of Neurons in Vision-Language Models

📅 2025-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the limited interpretability of vision-language models (VLMs) by categorizing neurons into three functional types: visual, text, and multi-modal. Methodologically, it combines statistical analysis of neuron activations, a GPT-4o-driven automated explanation framework, and an activation simulator for visual neurons, all evaluated on the LLaVA model. Key contributions include: (1) a functional classification of VLM neurons; (2) an explanation pipeline assisted by a large model; and (3) a check on explanation reliability via activation simulation. The statistical analyses characterize how the three neuron types behave, giving a clearer view of VLM internal mechanisms. The authors frame this as a step toward more transparent and trustworthy VLMs.

📝 Abstract

The burgeoning growth of open-source vision-language models (VLMs) has catalyzed a plethora of applications across diverse domains. Ensuring the transparency and interpretability of these models is critical for fostering trustworthy and responsible AI systems. In this study, our objective is to delve into the internals of VLMs to interpret the functions of individual neurons. We observe the activations of neurons with respect to the input visual tokens and text tokens, and reveal some interesting findings. In particular, we find neurons that respond only to visual information, only to text information, or to both, which we refer to as visual neurons, text neurons, and multi-modal neurons, respectively. We build a framework that automates the explanation of neurons with the assistance of GPT-4o. In addition, for visual neurons, we propose an activation simulator to assess the reliability of their explanations. Systematic statistical analyses on one representative VLM, LLaVA, uncover the behaviors and characteristics of the different categories of neurons.
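
The three-way split described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact criterion used, so the rule below is an assumption made for illustration: take the tokens on which a neuron fires most strongly and label the neuron by the share of those tokens that are visual. The function name `classify_neuron` and the parameters `top_k` and `threshold` are hypothetical, not taken from the paper.

```python
from typing import Sequence


def classify_neuron(
    activations: Sequence[float],
    is_visual: Sequence[bool],
    top_k: int = 4,
    threshold: float = 0.75,
) -> str:
    """Label one neuron as "visual", "text", or "multi-modal".

    activations[i] is the neuron's activation on token i, and is_visual[i]
    marks whether token i is a visual (image-patch) token or a text token.
    top_k and threshold are illustrative assumptions, not the paper's values.
    """
    if len(activations) != len(is_visual):
        raise ValueError("activations and is_visual must have equal length")
    if not activations:
        raise ValueError("need at least one token")

    # Indices of the top_k most strongly activated tokens.
    order = sorted(
        range(len(activations)), key=lambda i: activations[i], reverse=True
    )
    top = order[:top_k]

    # Fraction of the strongest activations that fall on visual tokens.
    visual_share = sum(1 for i in top if is_visual[i]) / len(top)

    if visual_share >= threshold:
        return "visual"
    if visual_share <= 1.0 - threshold:
        return "text"
    return "multi-modal"
```

Under this assumed rule, a neuron whose strongest activations all land on image-patch tokens is labeled visual, one whose strongest activations all land on text tokens is labeled text, and anything in between is labeled multi-modal. A real analysis would likely aggregate over many samples rather than a single token sequence.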

Problem

Research questions and friction points this paper is trying to address.

Interpret functions of neurons in vision-language models.

Develop framework for automated neuron explanation using GPT-4o.

Assess reliability of visual neuron explanations via activation simulator.
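
One way to picture the reliability check in the last item: a simulator predicts a neuron's activations from its text explanation, and the prediction is compared with the activations actually observed. The page does not say which score the paper uses, so the Pearson correlation below is an assumption, chosen because correlation is a common scoring choice in automated neuron-explanation work. The function name `explanation_score` is hypothetical.

```python
import math
from typing import Sequence


def explanation_score(
    simulated: Sequence[float], observed: Sequence[float]
) -> float:
    """Pearson correlation between simulated and observed activations.

    A score near 1 suggests the explanation predicts the neuron's behavior
    well; a score near 0 suggests it does not. This metric is an assumption
    for illustration, not necessarily the one used in the paper.
    """
    if len(simulated) != len(observed):
        raise ValueError("inputs must have equal length")
    n = len(simulated)
    if n < 2:
        raise ValueError("need at least two activations")

    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n

    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(simulated, observed))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    # A constant signal carries no information; treat it as uncorrelated.
    if var_s == 0.0 or var_o == 0.0:
        return 0.0
    return cov / math.sqrt(var_s * var_o)
```

In this sketch, `simulated` would hold the simulator's per-patch predictions for a visual neuron and `observed` the activations recorded from the model on the same patches.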

Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated neuron explanation using GPT-4o

Activation simulator for visual neuron reliability

Statistical analysis of neuron behaviors in LLaVA

🔎 Similar Papers

Interpreting Neurons in Deep Vision Networks with Language Models