MAGENTA: Magnitude and Geometry-ENhanced Training Approach for Robust Long-Tailed Sound Event Localization and Detection

📅 2025-09-19
🤖 AI Summary
In real-world scenarios, long-tailed class distributions severely degrade the performance of deep learning-based sound event localization and detection (SELD), as standard regression losses inherently bias optimization toward high-frequency classes, undermining modeling fidelity for rare events. To address this, we propose MAGENTA, a novel method that, for the first time, unifies magnitude (radial) and direction (angular) regression errors within an interpretable vector space. MAGENTA introduces a rarity-aware geometric decomposition loss, grounded in physical principles, to explicitly guide optimization and enhance model sensitivity and robustness to infrequent events. Extensive experiments on realistic long-tailed SELD benchmarks demonstrate that MAGENTA significantly improves both localization accuracy and event detection F1-score. This work establishes the first geometric error-decoupling optimization framework tailored for long-tailed acoustic perception tasks.

📝 Abstract
Deep learning-based Sound Event Localization and Detection (SELD) systems degrade significantly on real-world, long-tailed datasets. Standard regression losses bias learning toward frequent classes, causing rare events to be systematically under-recognized. To address this challenge, we introduce MAGENTA (Magnitude And Geometry-ENhanced Training Approach), a unified loss function that counteracts this bias within a physically interpretable vector space. MAGENTA geometrically decomposes the regression error into radial and angular components, enabling targeted, rarity-aware penalties and strengthened directional modeling. Empirically, MAGENTA substantially improves SELD performance on imbalanced real-world data, providing a principled foundation for a new class of geometry-aware SELD objectives. Code is available at: https://github.com/itsjunwei/MAGENTA_ICASSP
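
The radial/angular split mentioned in the abstract can be illustrated with a standard vector identity. This is a sketch under assumptions, not the paper's stated loss: here \hat{\mathbf{v}} is taken to be a predicted per-class vector, \mathbf{v} its target, and \theta the angle between them (all three symbols are introduced for illustration only).

```latex
\[
\lVert \hat{\mathbf{v}} - \mathbf{v} \rVert^{2}
= \underbrace{\bigl(\lVert \hat{\mathbf{v}} \rVert - \lVert \mathbf{v} \rVert\bigr)^{2}}_{\text{radial (magnitude) error}}
+ \underbrace{2\,\lVert \hat{\mathbf{v}} \rVert\,\lVert \mathbf{v} \rVert\,(1 - \cos\theta)}_{\text{angular (direction) error}}
\]
```

The identity follows from the law of cosines. It shows that a plain squared-error loss already contains a magnitude term and a direction term, which a geometry-aware objective could then weight separately.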
Problem

Research questions and friction points this paper is trying to address.

Addresses class imbalance in sound event detection
Reduces bias against rare sound events
Improves directional modeling in SELD systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometrically decomposes regression error into components
Enables rarity-aware penalties and directional modeling
Unified loss function counteracts bias in vector space
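
A minimal NumPy sketch of how such a decomposed, rarity-aware loss might look is given below. It is not the authors' implementation (see the linked repository for that). The function names, the inverse-frequency weighting, the `power` and `angular_scale` parameters, and the `(batch, classes, 3)` vector layout are all assumptions made for illustration.

```python
import numpy as np


def inverse_frequency_weights(class_counts, power=0.5):
    """Hypothetical rarity weights: rarer classes get larger weights (mean 1)."""
    counts = np.asarray(class_counts, dtype=float)
    weights = (counts.sum() / counts) ** power
    return weights / weights.mean()


def radial_angular_loss(pred, target, class_weights, angular_scale=1.0, eps=1e-8):
    """Illustrative loss on per-class 3-D vectors of shape (batch, classes, 3).

    Assumption: the vector norm encodes event activity and the vector
    direction encodes the direction of arrival.
    """
    pred_norm = np.linalg.norm(pred, axis=-1)
    target_norm = np.linalg.norm(target, axis=-1)

    # Radial component: squared difference of magnitudes.
    radial = (pred_norm - target_norm) ** 2

    # Angular component: 1 - cos(theta), counted only where the target is active.
    cosine = np.sum(pred * target, axis=-1) / (pred_norm * target_norm + eps)
    angular = np.where(target_norm > 0, 1.0 - cosine, 0.0)

    # Rarity-aware weighting: class_weights broadcasts over the batch axis.
    per_class = radial + angular_scale * angular
    return float(np.mean(per_class * class_weights))


# Two classes: one frequent (1000 examples), one rare (10 examples).
weights = inverse_frequency_weights([1000, 10])

target = np.zeros((1, 2, 3))
target[0, :, 0] = 1.0  # both classes active, pointing along +x

wrong_frequent = target.copy()
wrong_frequent[0, 0] = [0.0, 1.0, 0.0]  # 90-degree error on the frequent class

wrong_rare = target.copy()
wrong_rare[0, 1] = [0.0, 1.0, 0.0]  # same error on the rare class

loss_frequent = radial_angular_loss(wrong_frequent, target, weights)
loss_rare = radial_angular_loss(wrong_rare, target, weights)
```

In this toy setup the same 90-degree direction error yields a larger loss on the rare class than on the frequent one, which is the kind of rarity-aware behaviour the highlights describe.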
Jun-Wei Yeow
Smart Nation TRANS Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Ee-Leng Tan
Smart Nation TRANS Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Santi Peksi
Smart Nation TRANS Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Woon-Seng Gan
Professor of Audio Engineering and Director of Smart Nation Lab @ Nanyang Technological University
Active Noise Control, Machine & Deep Learning, Spatial Audio, Perceptual Evaluation