APEX: Audio Prototype EXplanations for Classification Tasks

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of interpretability methods tailored to the acoustic characteristics of audio classification, a domain where visual attribution techniques are often applied to spectrograms without adaptation. To bridge this gap, the authors propose APEX, a post-hoc explanation framework that requires no fine-tuning of the original model. APEX introduces multi-view prototype reasoning to the audio domain, which this summary describes as a first, using four prototype types (time-based, frequency-based, joint time-frequency, and square local-patch) to disentangle and explain input signals. It generates instance-level explanations grounded in acoustic similarity while preserving the model's original predictions, and the authors report that these explanations are semantically clearer and better aligned with human auditory perception than those of conventional gradient-based attribution methods.

📝 Abstract

Explainable AI (XAI) has achieved remarkable success in image classification, yet the audio domain lacks equally mature solutions. Current methods apply vision-based attribution techniques to spectrograms, overlooking fundamental differences between visual and acoustic signals. While prototype reasoning is promising, acoustic similarity remains multidimensional. We introduce APEX (Audio Prototype EXplanations), a post-hoc framework for interpreting pre-trained audio classifiers. Crucially, APEX requires no fine-tuning of the original backbone and leaves its outputs strictly unchanged. APEX disentangles explanations into four perspectives: Square-based prototypes, which localize transient events; Time-based prototypes, which capture temporal patterns; Frequency-based prototypes, which highlight spectral bands; and Time-Frequency-based prototypes, which integrate both. This yields intuitive, example-based explanations that respect acoustic properties, providing greater semantic clarity than standard gradient-based methods.
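
To make the four perspectives concrete, here is a minimal sketch of how a single (frequency, time) feature map could be split into four views and matched against stored prototypes. It is not the authors' implementation: the abstract does not specify how the views are computed, so the mean pooling, the non-overlapping square patches, the cosine similarity measure, and the names `split_views` and `nearest_prototype` are all assumptions made for illustration.

```python
import numpy as np


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two arrays, flattened to vectors."""
    a, b = np.ravel(a), np.ravel(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def split_views(feat, patch=2):
    """Split a (freq, time) feature map into four hypothetical views.

    Assumption: each view is a simple pooling or cropping of the map;
    the paper may define its views differently.
    """
    n_freq, n_time = feat.shape
    squares = [
        feat[f:f + patch, t:t + patch]
        for f in range(0, n_freq - patch + 1, patch)
        for t in range(0, n_time - patch + 1, patch)
    ]
    return {
        "time": feat.mean(axis=0),       # one value per time frame
        "frequency": feat.mean(axis=1),  # one value per frequency bin
        "time_frequency": feat,          # the full joint map
        "square": squares,               # local patches for transient events
    }


def nearest_prototype(query, prototypes):
    """Return (index, similarity) of the most similar stored prototype."""
    scores = [cosine(query, p) for p in prototypes]
    best = int(np.argmax(scores))
    return best, scores[best]


# Usage with a toy 4x6 feature map (4 frequency bins, 6 time frames).
feat = np.arange(24, dtype=float).reshape(4, 6)
views = split_views(feat)
time_prototypes = [-views["time"], views["time"]]  # made-up prototypes
idx, score = nearest_prototype(views["time"], time_prototypes)
```

Because a sketch like this only reads features from the classifier and never modifies its weights or logits, the model's predictions stay the same, which is the property the abstract calls output invariance.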

Problem

Research questions and friction points this paper is trying to address.

Explainable AI

audio classification

prototype reasoning

acoustic similarity

spectrogram interpretation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Audio Explainable AI

Prototype-based Explanation

Post-hoc Interpretability