Low-Complexity Acoustic Scene Classification with Device Information in the DCASE 2025 Challenge

📅 2025-05-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper introduces the Low-Complexity Acoustic Scene Classification with Device Information task of the DCASE 2025 Challenge, together with its baseline system. Problem: earlier editions (2022–2024) focused on low-complexity models, data efficiency, and device mismatch, but device information was not available at inference time, whereas in real deployments a model is often designed with the target hardware in mind. Method: the 2025 task provides the recording device information at inference time, so participants can build device-specific models that exploit device characteristics. The training set is the same 25% subset used in the corresponding DCASE 2024 task, and external data use is unrestricted, which makes transfer learning a central topic. Results: on the ten-class problem, the baseline reaches 50.72% accuracy with a device-general model and 51.89% when it uses the available device information, a gain of 1.17 percentage points.

📝 Abstract

This paper presents the Low-Complexity Acoustic Scene Classification with Device Information Task of the DCASE 2025 Challenge and its baseline system. Continuing the focus on low-complexity models, data efficiency, and device mismatch from previous editions (2022–2024), this year's task introduces a key change: recording device information is now provided at inference time. This enables the development of device-specific models that leverage device characteristics, reflecting real-world deployment scenarios in which a model is designed with awareness of the underlying hardware. The training set matches the 25% subset used in the corresponding DCASE 2024 challenge, with no restrictions on external data use, highlighting transfer learning as a central topic. The baseline achieves 50.72% accuracy on this ten-class problem with a device-general model, improving to 51.89% when using the available device information.
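To make the idea of using device information at inference time concrete, here is a minimal Python sketch of one possible arrangement: a device-general classifier plus optional device-specific classifiers, selected by device identifier. This is an illustration under stated assumptions, not the baseline's actual implementation. The class name `DeviceAwareClassifier`, the device identifiers, and the stand-in classifiers are all hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence

# A classifier maps a feature vector to a scene class index (0..9 for ten classes).
Classifier = Callable[[Sequence[float]], int]


class DeviceAwareClassifier:
    """Hypothetical wrapper: route each clip to a device-specific model when one
    exists for its device ID, otherwise fall back to the device-general model."""

    def __init__(
        self,
        general: Classifier,
        per_device: Optional[Dict[str, Classifier]] = None,
    ) -> None:
        self.general = general
        self.per_device = dict(per_device or {})

    def predict(self, features: Sequence[float], device_id: Optional[str]) -> int:
        # Unknown or missing device IDs use the general model.
        model = self.per_device.get(device_id, self.general)
        return model(features)


# Stand-in classifiers for illustration only (assumptions, not trained models).
def general_model(features: Sequence[float]) -> int:
    return 0


def device_a_model(features: Sequence[float]) -> int:
    return 1


classifier = DeviceAwareClassifier(general_model, {"device_a": device_a_model})
seen = classifier.predict([0.1, 0.2], "device_a")    # uses the device-specific model
unseen = classifier.predict([0.1, 0.2], "device_x")  # falls back to the general model
```

The fallback matters because a deployed system may encounter devices for which no specific model exists; in that case the device-general model still produces a prediction.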

Problem

Research questions and friction points this paper is trying to address.

Develop low-complexity models for acoustic scene classification

Address device mismatch using provided device information

Improve accuracy with device-specific models in real-world scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages device information for model development

Focuses on low-complexity and data efficiency

Employs transfer learning with external data
