Beyond Transcription: Mechanistic Interpretability in ASR

📅 2025-08-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Current ASR systems lack mechanistic understanding of how acoustic and semantic information evolves across layers, leaving their internal behavior largely opaque. This work systematically applies mechanistic interpretability techniques, including logit lens analysis, linear probing, and activation patching, to encoder-decoder ASR models, enabling layer-wise analysis of representational dynamics. It identifies cross-layer interaction pathways responsible for repetition hallucinations and uncovers latent semantic bias in deep acoustic representations: speech features undergo premature and excessive semantic grounding. These findings reveal previously unknown internal mechanisms of ASR models and point to actionable intervention sites for improving transparency, robustness, and controllability.
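The logit lens mentioned above can be illustrated with a minimal sketch: project an intermediate hidden state directly through the model's output head and see which token that layer would already predict. Everything here (layer count, dimensions, random weights) is a toy stand-in, not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a decoder stack: 4 layers, hidden size 8, vocab 10.
n_layers, d_model, vocab = 4, 8, 10
hidden_states = [rng.normal(size=d_model) for _ in range(n_layers)]
W_U = rng.normal(size=(d_model, vocab))  # unembedding / output projection

def logit_lens(h, W_U):
    """Decode an intermediate hidden state through the final output head,
    as if the network stopped at this layer."""
    logits = h @ W_U
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.argmax(probs)), probs

# Track how the layer-wise "early prediction" evolves across depth.
for layer, h in enumerate(hidden_states):
    tok, probs = logit_lens(h, W_U)
    print(f"layer {layer}: top token id {tok} (p={probs[tok]:.2f})")
```

In a real ASR model the same idea applies per decoding step: cache each layer's residual-stream state and reuse the trained output projection.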

📝 Abstract
Interpretability methods have recently gained significant attention, particularly in the context of large language models, enabling insights into linguistic representations, error detection, and model behaviors such as hallucinations and repetitions. However, these techniques remain underexplored in automatic speech recognition (ASR), despite their potential to advance both the performance and interpretability of ASR systems. In this work, we adapt and systematically apply established interpretability methods, such as logit lens, linear probing, and activation patching, to examine how acoustic and semantic information evolves across layers in ASR systems. Our experiments reveal previously unknown internal dynamics, including specific encoder-decoder interactions responsible for repetition hallucinations and semantic biases encoded deep within acoustic representations. These insights demonstrate the benefits of extending and applying interpretability techniques to speech recognition, opening promising directions for future research on improving model transparency and robustness.
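Linear probing, one of the methods the abstract lists, amounts to training a small linear classifier on frozen activations to test whether some property is linearly decodable at a given layer. A minimal numpy sketch on synthetic data; the "frame property" label, shapes, and training setup are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical: 200 frames of layer-k encoder activations (dim 16),
# each labelled with a binary acoustic property.
n, d = 200, 16
w_true = rng.normal(size=d)          # hidden direction encoding the property
X = rng.normal(size=(n, d))          # frozen activations
y = (X @ w_true > 0).astype(float)   # synthetic linearly-decodable labels

# Linear probe: logistic regression via plain gradient descent.
w = np.zeros(d)
lr = 0.5
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= lr * X.T @ (p - y) / n

preds = (1.0 / (1.0 + np.exp(-(X @ w)))) > 0.5
acc = (preds == y).mean()
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy at a layer is evidence the property is linearly represented there; comparing accuracy across layers traces where acoustic versus semantic information emerges.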
Problem

Research questions and friction points this paper is trying to address.

Adapting interpretability methods to ASR systems
Examining acoustic and semantic information evolution
Identifying internal dynamics causing repetition hallucinations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapting interpretability methods to ASR systems
Applying logit lens, linear probing, activation patching
Examining acoustic and semantic information evolution
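Activation patching, the third technique listed, swaps an activation cached from a clean run into a corrupted run and measures how much of the clean output is recovered; recovery localizes where the causally relevant information lives. A toy numpy sketch, with a two-layer stack as a hypothetical stand-in for an ASR encoder:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy two-layer network standing in for part of an encoder stack.
W1 = rng.normal(size=(8, 8))
W2 = rng.normal(size=(8, 8))
W_out = rng.normal(size=(8, 5))

def forward(x, patch=None):
    """Run the stack; optionally overwrite (patch) the layer-1 activation
    with one cached from a different run."""
    h1 = np.tanh(x @ W1)
    if patch is not None:
        h1 = patch
    h2 = np.tanh(h1 @ W2)
    return h2 @ W_out, h1

x_clean = rng.normal(size=8)
x_corrupt = x_clean + rng.normal(scale=2.0, size=8)  # corrupted input

logits_clean, h1_clean = forward(x_clean)
logits_corrupt, _ = forward(x_corrupt)
# Patch the clean layer-1 activation into the corrupted run.
logits_patched, _ = forward(x_corrupt, patch=h1_clean)

# In this toy model everything downstream reads only h1, so patching it
# exactly recovers the clean output; in a real model, partial recovery
# quantifies how much that activation site matters.
print(np.linalg.norm(logits_patched - logits_clean))
print(np.linalg.norm(logits_corrupt - logits_clean))
```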