🤖 AI Summary
Anomalous sound detection (ASD) performance degrades significantly in industrial scenarios where neither operational-state annotations nor data from sufficiently similar machines are available. To address this, the authors propose a pipeline that needs only unlabeled normal sounds from the target machine plus external audio. The approach combines three components: (1) an anomaly-score-based selector that builds a pseudo-anomalous set from external audio; (2) triplet learning that assigns pseudo-labels to the unlabeled data; and (3) iterative training that refines both the pseudo-anomalous set and the pseudo-labels. On the DCASE 2022-2024 Task 2 datasets, the method improves average AUC by more than 6.6 points over conventional methods in the unlabeled setting. In the labeled setting, adding external data from the pseudo-anomalous set improves performance further, which suggests the approach is robust when machine data and labels are scarce.
📝 Abstract
This paper addresses performance degradation in anomalous sound detection (ASD) when neither sufficiently similar machine data nor operational-state labels are available. We present an integrated pipeline that combines three complementary components derived from prior work and extends them to the unlabeled ASD setting. First, we adapt an anomaly-score-based selector to curate external audio data resembling the normal sounds of the target machine; the selected clips form the pseudo-anomalous set. Second, we use triplet learning to assign pseudo-labels to unlabeled data, enabling finer classification of operational sounds and detection of subtle anomalies. Third, we employ iterative training to refine both the pseudo-anomalous set selection and the pseudo-label assignment, progressively improving detection accuracy. Experiments on the DCASE 2022-2024 Task 2 datasets demonstrate that, in unlabeled settings, our approach achieves an average AUC increase of over 6.6 points compared to conventional methods. In labeled settings, incorporating external data from the pseudo-anomalous set further boosts performance. These results highlight the practicality and robustness of our method in scenarios with scarce machine data and labels, facilitating ASD deployment across diverse industrial settings with minimal annotation effort.
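The first two components can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the scoring rule, the selection criterion, or the loss details, so everything below is an assumption made for illustration. Specifically, it assumes clips are already embedded as vectors, uses cosine distance to the nearest normal embedding as the anomaly score, keeps the `k` lowest-scoring external clips as the pseudo-anomalous set, and uses the standard triplet margin loss with a hypothetical margin of 0.5. All function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def anomaly_score(embedding, normal_embeddings):
    """Assumed scoring rule: distance to the nearest normal embedding."""
    return min(cosine_distance(embedding, n) for n in normal_embeddings)


def select_pseudo_anomalous(external, normal_embeddings, k):
    """Return indices of the k external clips with the lowest anomaly scores,
    i.e. the external clips that most resemble the target machine's normal sounds."""
    scores = [anomaly_score(e, normal_embeddings) for e in external]
    ranked = sorted(range(len(external)), key=lambda i: scores[i])
    return ranked[:k]


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (made up for illustration only).
normal = [[1.0, 0.0], [0.9, 0.1]]
external = [[0.0, 1.0], [1.0, 0.2], [-1.0, 0.0]]

# Pick the single external clip closest to the normal sounds.
selected = select_pseudo_anomalous(external, normal, k=1)

# Use it as the negative in a triplet with two normal clips.
loss = triplet_loss(normal[0], normal[1], external[selected[0]])
```

In this toy setup the selected negative lies close to the normal clips, so the loss stays positive and the triplet is informative for training. A negative that is already far from the anchor yields zero loss and contributes nothing, which is one plausible reason to prefer external clips that resemble normal sounds.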