MAGENTA: Magnitude and Geometry-ENhanced Training Approach for Robust Long-Tailed Sound Event Localization and Detection

📅 2025-09-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Long-tailed class distributions in real-world data severely degrade deep learning-based sound event localization and detection (SELD): standard regression losses bias optimization toward frequent classes, so rare events are modeled poorly. To address this, the paper proposes MAGENTA, a unified loss that decomposes the regression error into magnitude (radial) and direction (angular) components within a physically interpretable vector space. The decomposition lets the loss apply rarity-aware penalties and strengthen directional modeling, which makes the model more sensitive and robust to infrequent events. Experiments on realistic long-tailed SELD benchmarks show that MAGENTA improves both localization accuracy and event detection F1-score. The authors position the work as a foundation for geometry-aware, error-decoupling objectives in long-tailed acoustic perception.

📝 Abstract

Deep learning-based Sound Event Localization and Detection (SELD) systems degrade significantly on real-world, long-tailed datasets. Standard regression losses bias learning toward frequent classes, causing rare events to be systematically under-recognized. To address this challenge, we introduce MAGENTA (Magnitude And Geometry-ENhanced Training Approach), a unified loss function that counteracts this bias within a physically interpretable vector space. MAGENTA geometrically decomposes the regression error into radial and angular components, enabling targeted, rarity-aware penalties and strengthened directional modeling. Empirically, MAGENTA substantially improves SELD performance on imbalanced real-world data, providing a principled foundation for a new class of geometry-aware SELD objectives. Code is available at: https://github.com/itsjunwei/MAGENTA_ICASSP
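One way to see why a radial/angular split is natural for vector regression is the law of cosines. The identity below is a general fact about any two vectors, not the paper's own formulation; the symbols $\hat{\mathbf{y}}$ (predicted vector), $\mathbf{y}$ (target vector), and $\theta$ (angle between them) are illustrative notation introduced here.

```latex
\|\hat{\mathbf{y}} - \mathbf{y}\|^2
  = \underbrace{\bigl(\|\hat{\mathbf{y}}\| - \|\mathbf{y}\|\bigr)^2}_{\text{radial (magnitude) error}}
  + \underbrace{2\,\|\hat{\mathbf{y}}\|\,\|\mathbf{y}\|\,\bigl(1 - \cos\theta\bigr)}_{\text{angular (direction) error}}
```

A plain squared-error loss therefore mixes the two terms with fixed, implicit weights. Treating them as separate terms is what makes it possible to penalize each one differently, for example per class. The exact weighting MAGENTA uses is defined in the paper and its repository, not here.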

Problem

Research questions and friction points this paper is trying to address.

Addresses class imbalance in sound event detection

Reduces bias against rare sound events

Improves directional modeling in SELD systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometrically decomposes regression error into components

Enables rarity-aware penalties and directional modeling

Unified loss function counteracts bias in vector space
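The points above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the inverse-frequency weighting, the `lam` trade-off parameter, and the assumption that each class is represented by a 3D vector whose length encodes activity and whose direction encodes the source direction are all hypothetical choices made for illustration.

```python
import numpy as np

def rarity_weights(class_counts):
    """Hypothetical rarity weights: inverse class frequency, normalized to mean 1."""
    inv = 1.0 / np.asarray(class_counts, dtype=float)
    return inv / inv.mean()

def radial_angular_errors(pred, target, eps=1e-8):
    """Split the error between pred and target vectors of shape (N, C, 3)."""
    pred_norm = np.linalg.norm(pred, axis=-1)    # (N, C)
    tgt_norm = np.linalg.norm(target, axis=-1)   # (N, C)
    # Radial part: squared difference of vector lengths.
    radial = (pred_norm - tgt_norm) ** 2
    # Angular part: 1 - cosine similarity, only where the target is active,
    # because a zero-length target has no defined direction.
    cos = np.sum(pred * target, axis=-1) / (pred_norm * tgt_norm + eps)
    angular = np.where(tgt_norm > eps, 1.0 - cos, 0.0)
    return radial, angular

def sketch_loss(pred, target, class_counts, lam=1.0):
    """Rarity-weighted sum of radial and angular errors (illustrative only)."""
    w = rarity_weights(class_counts)             # (C,), broadcasts over (N, C)
    radial, angular = radial_angular_errors(pred, target)
    return float(np.mean(w * (radial + lam * angular)))
```

With this weighting, the same directional mistake costs more when it falls on a class with few training examples than on a frequent one, which is the behaviour the rarity-aware penalty is meant to produce.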

🔎 Similar Papers

TF-Mamba: A Time-Frequency Network for Sound Source Localization