Multi-scale Scanning Network for Machine Anomalous Sound Detection

📅 2025-08-23
🤖 AI Summary
To address the insufficient modeling of multi-scale time-frequency patterns in machine anomalous sound detection (ASD), this paper proposes the Multi-scale Scanning Network (MSN). MSN scans Mel-spectrogram inputs with kernel boxes of varying sizes, explicitly capturing recurrent time-frequency structures across scales. A lightweight convolutional network with shared weights processes the scanned regions, which reduces the parameter count and keeps the feature representation efficient and scalable. Evaluated on the DCASE 2020 and DCASE 2023 Task 2 benchmarks, MSN achieves state-of-the-art performance. The results support the importance of multi-scale structural modeling for industrial acoustic anomaly detection.

📝 Abstract
Machine sounds exhibit consistent and repetitive patterns in both the frequency and time domains, which vary significantly across scales for different machine types. For instance, rotating machines often show periodic features in short time intervals, while reciprocating machines exhibit broader patterns spanning the time domain. While prior studies have leveraged these patterns to improve Anomalous Sound Detection (ASD), the variation of patterns across scales remains insufficiently explored. To address this gap, we introduce a Multi-scale Scanning Network (MSN) designed to capture patterns at multiple scales. MSN employs kernel boxes of varying sizes to scan audio spectrograms and integrates a lightweight convolutional network with shared weights for efficient and scalable feature representation. Experimental evaluations on the DCASE 2020 and DCASE 2023 Task 2 datasets demonstrate that MSN achieves state-of-the-art performance, highlighting its effectiveness in advancing ASD systems.
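The scanning idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `scan_boxes`, the box sizes, the half-box stride, the 8x8 output shape, and the use of average pooling to bring every box to a common shape are all assumptions chosen for illustration, and random data stands in for a real spectrogram.

```python
import numpy as np

def scan_boxes(spec, box_sizes, stride_frac=0.5, out_shape=(8, 8)):
    """Slide boxes of several sizes over a (freq, time) spectrogram.

    Each box is average-pooled to `out_shape` so that patches from every
    scale share one shape. Returns {box_size: array (n_boxes, *out_shape)}.
    All parameter choices here are illustrative assumptions.
    """
    n_freq, n_time = spec.shape
    oh, ow = out_shape
    patches = {}
    for bh, bw in box_sizes:
        if bh % oh or bw % ow:
            raise ValueError("box size must be a multiple of out_shape")
        # Stride is a fraction of the box size, so larger boxes take larger steps.
        sh, sw = max(1, int(bh * stride_frac)), max(1, int(bw * stride_frac))
        boxes = []
        for f0 in range(0, n_freq - bh + 1, sh):
            for t0 in range(0, n_time - bw + 1, sw):
                box = spec[f0:f0 + bh, t0:t0 + bw]
                # Average-pool the box down to the common patch shape.
                pooled = box.reshape(oh, bh // oh, ow, bw // ow).mean(axis=(1, 3))
                boxes.append(pooled)
        patches[(bh, bw)] = np.stack(boxes)
    return patches

rng = np.random.default_rng(0)
spec = rng.standard_normal((64, 128))  # stand-in for a spectrogram (freq x time)
patches = scan_boxes(spec, box_sizes=[(8, 8), (16, 16), (32, 32)])
for size, p in patches.items():
    print(size, p.shape)
```

Small boxes produce many patches that each cover a short time-frequency region, while large boxes produce fewer patches that each cover a broader region. That is one simple way to expose both short periodic features and broader patterns to the same downstream model.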
Problem

Research questions and friction points this paper is trying to address.

Detecting anomalous machine sounds across varying frequency and time scales
Capturing multi-scale patterns in audio spectrograms for different machine types
Improving anomalous sound detection performance through scalable feature representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-scale Scanning Network (MSN) captures patterns at multiple time-frequency scales
Kernel boxes of varying sizes scan audio spectrograms
Lightweight convolutional network with shared weights
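As a rough illustration of the shared-weight idea listed above, the sketch below applies one small filter bank to patches from several scales. It is an assumption-laden toy, not the paper's architecture: the helper names `conv2d_valid` and `shared_encoder`, the 3x3 filters, the channel count, the mean-pooling steps, and the randomly generated patches are all hypothetical.

```python
import numpy as np

def conv2d_valid(x, w):
    """Valid cross-correlation.

    x: (n, H, W) patches; w: (c_out, kh, kw) filters.
    Returns (n, c_out, H - kh + 1, W - kw + 1).
    """
    # windows has shape (n, H - kh + 1, W - kw + 1, kh, kw).
    windows = np.lib.stride_tricks.sliding_window_view(x, w.shape[1:], axis=(1, 2))
    return np.einsum("nhwij,cij->nchw", windows, w)

def shared_encoder(patches, w):
    """Conv -> ReLU -> global average pool, with the same weights for any scale."""
    feat = np.maximum(conv2d_valid(patches, w), 0.0)
    return feat.mean(axis=(2, 3))  # (n, c_out)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3, 3)) * 0.1  # one filter bank, reused at every scale

# Hypothetical patches from three scales, already resized to 8x8.
scales = {
    name: rng.standard_normal((n, 8, 8))
    for name, n in [("small", 20), ("medium", 10), ("large", 5)]
}

# One vector per scale (mean over that scale's patches), then concatenate.
embedding = np.concatenate(
    [shared_encoder(p, w).mean(axis=0) for p in scales.values()]
)
print(embedding.shape)  # number of scales times number of filters
print(w.size)           # parameter count is independent of the number of scales
```

Because `w` is reused for every scale, adding another box size grows the embedding but not the number of learnable parameters, which is the property the "shared weights" item points to.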
Yucong Zhang
Ph.D. Student in CS, Wuhan University
Juan Liu
Wuhan University
Data Mining, Artificial Intelligence in Bioinformatics, Biomedicine
Ming Li
School of Computer Science, Wuhan University, Wuhan, China; Suzhou Municipal Key Laboratory of Multimodal Intelligent Systems, Digital Innovation Research Center, Duke Kunshan University, Suzhou, China