ESTM: An Enhanced Dual-Branch Spectral-Temporal Mamba for Anomalous Sound Detection

📅 2025-09-02

🤖 AI Summary

Industrial anomalous sound detection (ASD) faces two key challenges: the time-frequency coupling in acoustic features is hard to model, and long-range temporal dependencies are weakly captured. To address these, we propose a dual-branch Mamba architecture, presented as the first to integrate selective state-space models (SSMs) into industrial ASD. The design uses a time-frequency decoupled dual-path structure that models temporal dynamics and spectral-band interactions separately, augmented by a TriStat-Gating (TSG) module for adaptive feature gating. We also fuse enhanced Mel-spectrogram representations with raw waveform features to improve sensitivity to subtle anomalies. Evaluated on the DCASE 2020 Task 2 benchmark, the method improves anomaly detection performance over existing approaches under industrial noise conditions.

📝 Abstract

The core challenge in anomalous sound detection (ASD) for industrial equipment lies in modeling the time-frequency coupling of acoustic features. Existing methods are limited by local receptive fields, which makes it difficult to capture long-range temporal patterns and cross-band dynamic coupling in machine sounds. In this paper, we propose ESTM, a framework built on a dual-path Mamba architecture with time-frequency decoupled modeling that uses selective state-space models (SSMs) for long-range sequence modeling. ESTM extracts rich feature representations from different time segments and frequency bands by fusing enhanced Mel spectrograms with raw audio features, and its TriStat-Gating (TSG) module further improves sensitivity to anomalous patterns. Our experiments show that ESTM improves anomaly detection performance on the DCASE 2020 Task 2 dataset, validating the effectiveness of the proposed method.
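To make the "selective state-space model" idea concrete, here is a minimal NumPy sketch of a simplified selective scan, in which the step size and the input/output projections depend on the current input. It is an illustration under stated assumptions, not the ESTM implementation: the function name `selective_scan`, the weight names `W_delta`, `W_B`, `W_C`, and all shapes are hypothetical, and the recurrence omits the bias terms, convolution, and gating found in full Mamba blocks.

```python
import numpy as np


def softplus(z):
    # Numerically stable softplus; keeps the step size strictly positive.
    return np.logaddexp(0.0, z)


def selective_scan(x, A, W_delta, W_B, W_C):
    """Simplified selective SSM recurrence (illustrative, not the paper's code).

    x:       (T, D) input sequence
    A:       (D, N) diagonal state matrix entries, expected to be negative
    W_delta: (D, D) projection producing the input-dependent step size
    W_B:     (D, N) projection producing the input-dependent input matrix
    W_C:     (D, N) projection producing the input-dependent output matrix
    Returns y with shape (T, D).
    """
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))            # hidden state, one N-vector per channel
    y = np.empty((T, D))
    for t in range(T):
        delta = softplus(x[t] @ W_delta)       # (D,) step size per channel
        B = x[t] @ W_B                         # (N,)
        C = x[t] @ W_C                         # (N,)
        A_bar = np.exp(delta[:, None] * A)     # (D, N) discretized decay
        B_bar = delta[:, None] * B[None, :]    # (D, N) simplified discretization
        h = A_bar * h + B_bar * x[t][:, None]  # state update
        y[t] = h @ C                           # (D,) readout
    return y
```

Because the state at step t is a decayed sum over all earlier inputs, the output can depend on arbitrarily distant frames, and the input-dependent `delta` lets the model decide per step how much history to keep.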

Problem

Research questions and friction points this paper is trying to address.

Modeling time-frequency coupling in acoustic features

Capturing long-range temporal patterns in machine sounds

Detecting cross-band dynamic coupling effects in audio

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-path Mamba architecture for time-frequency decoupling

Selective State-Space Models for long-range sequence modeling

TriStat-Gating (TSG) module for enhanced sensitivity to anomalous patterns
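The time-frequency decoupling listed above can be pictured as running one sequence model along the time axis and another along the frequency axis of the same spectrogram. The sketch below is only an assumption-laden illustration: `ema_scan` is a placeholder exponential-moving-average recurrence standing in for a Mamba block, `dual_path` is a hypothetical name, and stacking the two branch outputs is an assumed fusion step, since the summary does not describe how ESTM combines its branches.

```python
import numpy as np


def ema_scan(tokens, decay=0.9):
    """Placeholder sequence mixer (a stand-in for a Mamba block).

    tokens: (L, C) sequence of L tokens with C channels each.
    Returns (L, C); each step blends the running state with the current token.
    """
    state = np.zeros(tokens.shape[1])
    out = np.empty_like(tokens, dtype=float)
    for i, tok in enumerate(tokens):
        state = decay * state + (1.0 - decay) * tok
        out[i] = state
    return out


def dual_path(spec, decay=0.9):
    """Process a spectrogram along time and along frequency separately.

    spec: (F, T) spectrogram with F frequency bands and T frames.
    Returns an array of shape (2, F, T): [temporal branch, spectral branch].
    """
    # Temporal branch: each frame is a token, scanned in time order.
    temporal = ema_scan(spec.T, decay).T
    # Spectral branch: each frequency band is a token, scanned low to high.
    spectral = ema_scan(spec, decay)
    # Assumed fusion: stack both views so a later layer can combine them.
    return np.stack([temporal, spectral])
```

Swapping `ema_scan` for a selective SSM would leave the axis handling unchanged; only the per-branch sequence model differs.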
