🤖 AI Summary
This work addresses the limitations of conventional frame-level sound event detection models, which rely on post-processing to recover events and therefore suffer from ambiguous temporal boundaries. To overcome these issues, the authors propose an end-to-end boundary-aware modeling approach that explicitly captures event onsets and offsets through a Recurrent Event Detection (RED) layer and an Event Proposal Network (EPN). Tailored loss functions enable boundary-sensitive optimization and inference. The method requires no post-processing or post-processing hyperparameter tuning, and it scales to achieve state-of-the-art performance across all AudioSet Strong classes, significantly outperforming existing frame-level models paired with post-processing.
📝 Abstract
Temporal detection problems appear in many fields, including time-series estimation, activity recognition, and sound event detection (SED). In this work, we propose a new approach to temporal event modeling that explicitly models event onsets and offsets, and we introduce boundary-aware optimization and inference strategies that substantially enhance temporal event detection. The presented methodology incorporates new temporal modeling layers - Recurrent Event Detection (RED) and Event Proposal Network (EPN) - which, together with tailored loss functions, enable more effective and precise temporal event detection. We evaluate the proposed method in the SED domain on a subset of AudioSet with temporally-strong annotations. Experimental results show that our approach not only outperforms traditional frame-wise SED models equipped with state-of-the-art post-processing, but also removes the need for post-processing hyperparameter tuning, and scales to achieve new state-of-the-art performance across all AudioSet Strong classes.