🤖 AI Summary
This study addresses the limited neuron-level interpretability of general-purpose audio self-supervised learning (SSL) models. It presents the first systematic neuron-level dissection of such a model, introducing a mechanistic interpretability framework grounded in conditional activation patterns and coupled with semantic and acoustic similarity assessments. The analysis reveals class-specific neurons within these models that respond selectively to semantic audio features such as speech attributes and musical pitch. Further validation shows that these neurons not only generalize across diverse downstream tasks but also make functionally significant contributions to classification performance, highlighting their role in the model's representational capacity and transferability.
📄 Abstract
In this paper, we analyze the internal representations of a general-purpose audio self-supervised learning (SSL) model from a neuron-level perspective. Although SSL audio models perform strongly as general-purpose feature extractors, the internal mechanisms underlying their robust generalization remain unclear. Drawing on the framework of mechanistic interpretability, we identify and examine class-specific neurons by analyzing conditional activation patterns across diverse tasks. Our analysis reveals that SSL training fosters the emergence of class-specific neurons that provide extensive coverage of novel task classes. These neurons exhibit shared responses across categories linked by semantic or acoustic similarity, such as speech attributes and musical pitch. We also confirm that these neurons have a functional impact on classification performance. To our knowledge, this is the first systematic neuron-level analysis of a general-purpose audio SSL model, providing new insights into its internal representations.
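The abstract does not specify the exact selection criterion, so the sketch below illustrates one plausible reading of "identifying class-specific neurons from conditional activation patterns": binarize activations and flag neurons that fire far more often under one class than under any other. Every name here (`class_specific_neurons`, `act_threshold`, `selectivity`) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def class_specific_neurons(acts, labels, act_threshold=0.0, selectivity=0.5):
    """Flag neurons whose activation is strongly conditioned on one class.

    acts   : (n_samples, n_neurons) pooled hidden activations from the SSL model
    labels : (n_samples,) integer class labels for a downstream task
    Returns {class_id: array of neuron indices} where each neuron's firing
    probability under that class beats its best rival class by `selectivity`.
    """
    fired = acts > act_threshold                    # binarize activation events
    classes = np.unique(labels)
    # p_cond[c, j] = estimated P(neuron j fires | class c)
    p_cond = np.stack([fired[labels == c].mean(axis=0) for c in classes])
    specific = {}
    for i, c in enumerate(classes):
        rival = np.delete(p_cond, i, axis=0).max(axis=0)  # strongest other class
        specific[c] = np.flatnonzero(p_cond[i] - rival >= selectivity)
    return specific

# Toy usage: 200 clips, 768-dim features, 5 classes
rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 768))
labels = rng.integers(0, 5, size=200)
print({c: idx.size for c, idx in class_specific_neurons(acts, labels).items()})
```

Under this formulation, the paper's functional-impact claim could be probed by ablating (zeroing) the returned neuron indices and measuring the resulting drop in downstream classification accuracy.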