🤖 AI Summary
This work presents the Low-Complexity Acoustic Scene Classification with Device Information task of the DCASE 2025 Challenge and its baseline system. The task targets robustness across diverse recording devices, and unlike the 2022–2024 editions, the recording device is known at inference time rather than hidden from the model.
Method: Because recording device information is provided at inference time, systems can use device-specific models that exploit device characteristics instead of relying on a single device-general model. The training set is the same 25% subset used in DCASE 2024, and external data use is unrestricted, which makes transfer learning a central topic.
Contribution/Results: The task keeps the focus on low-complexity models, data efficiency, and device mismatch, and adds awareness of the underlying hardware. On the ten-class scene classification problem, the baseline reaches 50.72% accuracy with a device-general model and 51.89% when using the available device information, a gain of 1.17 percentage points. Providing device identity at inference time is the key change relative to previous editions of the task.
📝 Abstract
This paper presents the Low-Complexity Acoustic Scene Classification with Device Information Task of the DCASE 2025 Challenge and its baseline system. Continuing the focus on low-complexity models, data efficiency, and device mismatch from previous editions (2022–2024), this year's task introduces a key change: recording device information is now provided at inference time. This enables the development of device-specific models that leverage device characteristics, reflecting real-world deployment scenarios in which a model is designed with awareness of the underlying hardware. The training set matches the 25% subset used in the corresponding DCASE 2024 challenge, with no restrictions on external data use, highlighting transfer learning as a central topic. The baseline achieves 50.72% accuracy on this ten-class problem with a device-general model, improving to 51.89% when using the available device information.
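The abstract does not say how the baseline consumes device information. As a minimal sketch of what device-aware inference could look like, assume one model per known device plus a device-general fallback for devices without a dedicated model. The function `predict_scene`, the device labels, and the toy models below are hypothetical names chosen for illustration and are not taken from the baseline system.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical type: a model maps a feature vector to one score per scene class.
Model = Callable[[Sequence[float]], List[float]]

NUM_CLASSES = 10  # the task is a ten-class problem


def predict_scene(
    features: Sequence[float],
    device_id: Optional[str],
    general_model: Model,
    device_models: Dict[str, Model],
) -> int:
    """Return the index of the highest-scoring scene class.

    Uses the device-specific model when one is registered for `device_id`;
    otherwise falls back to the device-general model.
    """
    if device_id is not None and device_id in device_models:
        model = device_models[device_id]
    else:
        model = general_model
    scores = model(features)
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy stand-ins for trained models (assumptions for illustration only).
def general_model(features: Sequence[float]) -> List[float]:
    scores = [0.0] * NUM_CLASSES
    scores[0] = 1.0  # always favors class 0
    return scores


def device_a_model(features: Sequence[float]) -> List[float]:
    scores = [0.0] * NUM_CLASSES
    scores[3] = 1.0  # always favors class 3
    return scores


device_models = {"device_a": device_a_model}

known = predict_scene([0.1, 0.2], "device_a", general_model, device_models)
unseen = predict_scene([0.1, 0.2], "device_x", general_model, device_models)
```

Here `known` is routed to the device-specific model and `unseen` to the general one. The fallback matters because a deployed system may meet devices for which no dedicated model was trained.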