🤖 AI Summary
This study addresses the lack of neuron-level understanding of how current large audio-language models encode emotion. It provides the first causal evidence for emotion-sensitive neurons by identifying candidate units with frequency-, entropy-, magnitude-, and contrast-based selectors, followed by ablation and gain-of-function interventions. Experiments on Qwen2.5-Omni, Kimi-Audio, and Audio Flamingo 3 demonstrate that the identified neurons exhibit emotion specificity, intervenability, and partial cross-dataset transferability. Systematic modulation of these neurons consistently alters the models’ emotion recognition performance, confirming their critical role in affective decision-making.
📝 Abstract
Emotion is a central dimension of spoken communication, yet we still lack a mechanistic account of how modern large audio-language models (LALMs) encode it internally. We present the first neuron-level interpretability study of emotion-sensitive neurons (ESNs) in LALMs and provide causal evidence that such units exist in Qwen2.5-Omni, Kimi-Audio, and Audio Flamingo 3. Across these three widely used open-source models, we compare frequency-, entropy-, magnitude-, and contrast-based neuron selectors on multiple emotion recognition benchmarks. Using inference-time interventions, we reveal a consistent emotion-specific signature: ablating neurons selected for a given emotion disproportionately degrades recognition of that emotion while largely preserving other classes, whereas gain-based amplification steers predictions toward the target emotion. These effects arise with modest identification data and scale systematically with intervention strength. We further observe that ESNs exhibit non-uniform layer-wise clustering with partial cross-dataset transfer. Taken together, our results offer a causal, neuron-level account of emotion decisions in LALMs and highlight targeted neuron interventions as an actionable handle for controllable affective behaviors.
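To make the pipeline concrete, here is a minimal sketch (not the authors' released code) of the two ingredients the abstract describes: a contrast-based selector that scores neurons by how much more they fire on a target emotion than on others, and an inference-time hook that ablates (gain = 0) or amplifies (gain > 1) the selected units. The toy layer, tensor shapes, and function names are illustrative assumptions; in practice the activations would be recorded from a hidden layer of one of the studied LALMs.

```python
# Sketch: contrast-based ESN selection + inference-time ablation/amplification.
# Assumes per-neuron activations for one layer have been recorded on an
# emotion-labeled identification set. All names here are illustrative.
import torch
import torch.nn as nn


def contrast_scores(acts: torch.Tensor, labels: torch.Tensor, target: int) -> torch.Tensor:
    """Score each neuron by mean activation on the target emotion minus
    mean activation on all other emotions.

    acts:   (num_samples, num_neurons) recorded activations
    labels: (num_samples,) integer emotion labels
    """
    in_class = acts[labels == target].mean(dim=0)
    out_class = acts[labels != target].mean(dim=0)
    return in_class - out_class


def select_neurons(acts: torch.Tensor, labels: torch.Tensor, target: int, k: int = 32) -> torch.Tensor:
    """Return indices of the top-k neurons by contrast score."""
    return torch.topk(contrast_scores(acts, labels, target), k).indices


def intervention_hook(neuron_idx: torch.Tensor, gain: float):
    """Forward hook rescaling the selected neurons: gain=0.0 ablates,
    gain>1.0 amplifies them at inference time."""
    def hook(module, inputs, output):
        output = output.clone()
        output[..., neuron_idx] = gain * output[..., neuron_idx]
        return output
    return hook


if __name__ == "__main__":
    torch.manual_seed(0)
    num_samples, hidden = 200, 512
    acts = torch.randn(num_samples, hidden)       # stand-in for recorded activations
    labels = torch.randint(0, 4, (num_samples,))  # stand-in for 4 emotion classes

    idx = select_neurons(acts, labels, target=2, k=16)

    layer = nn.Linear(hidden, hidden)             # stand-in for one LALM hidden layer
    handle = layer.register_forward_hook(intervention_hook(idx, gain=0.0))  # ablation
    _ = layer(torch.randn(1, hidden))             # forward pass runs with the intervention
    handle.remove()
```

In this reading, the paper's emotion-specificity finding corresponds to ablation (gain = 0) hurting the target emotion far more than the other classes, while amplification (gain > 1) biases predictions toward it, with effect size growing as the gain moves further from 1.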