🤖 AI Summary
Acoustic scene classification (ASC) suffers from domain shift induced by device heterogeneity, leading to substantial performance degradation under label scarcity. Conventional curriculum learning approaches employ static sample ordering or weighting schemes, failing to adapt to the dynamic evolution of sample difficulty and marginal utility during training. To address this, we propose a dynamic dual-signal curriculum learning framework that jointly models two time-varying signals: domain invariance—capturing cross-device commonalities—and learning progress—reflecting sample difficulty and model convergence status. These signals jointly inform real-time sample weighting: early training emphasizes domain-invariant features, while device-specific modeling is progressively incorporated later. Evaluated on the DCASE 2024 Task 1 benchmark, our method significantly improves cross-device classification accuracy across diverse labeling budgets and demonstrates superior generalization to unseen devices, validating the efficacy of data-efficient, adaptive curriculum design.
📝 Abstract
Acoustic scene classification (ASC) suffers from device-induced domain shift, especially when labels are limited. Prior work focuses on curriculum-based training schedules that structure data presentation by ordering or reweighting training examples from easy to hard to facilitate learning; however, existing curricula are static, fixing the ordering or the weights before training and ignoring that example difficulty and marginal utility evolve with the learned representation. To overcome this limitation, we propose the Dynamic Dual-Signal Curriculum (DDSC), a training schedule that adapts the curriculum online by combining two signals computed each epoch: a domain-invariance signal and a learning-progress signal. A time-varying scheduler fuses these signals into per-example weights that prioritize domain-invariant examples in early epochs and progressively emphasize device-specific cases. DDSC is lightweight, architecture-agnostic, and introduces no additional inference overhead. Under the official DCASE 2024 Task 1 protocol, DDSC consistently improves cross-device performance across diverse ASC baselines and label budgets, with the largest gains on unseen-device splits.
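The weighting scheme described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the linear decay schedule, and the convex-combination fusion rule are all assumptions, since the abstract only states that a time-varying scheduler fuses the two per-example signals into weights.

```python
# Hypothetical sketch of DDSC-style dynamic sample weighting.
# ASSUMPTIONS: the linear schedule and convex fusion below are illustrative
# choices; the paper's exact scheduler and fusion formulas are not given here.

def mixing_coefficient(epoch, total_epochs):
    """Time-varying scheduler: starts near 1 (emphasize domain-invariant
    examples early) and decays toward 0 (emphasize device-specific cases)."""
    return max(0.0, 1.0 - epoch / total_epochs)

def fuse_weights(domain_invariance, learning_progress, epoch, total_epochs):
    """Fuse the two per-example signals (recomputed each epoch) into
    normalized per-example weights for the current epoch."""
    alpha = mixing_coefficient(epoch, total_epochs)
    raw = [alpha * d + (1.0 - alpha) * p
           for d, p in zip(domain_invariance, learning_progress)]
    total = sum(raw)
    return [r / total for r in raw]
```

For example, an example with a high domain-invariance score dominates the weighting at epoch 0, while by the final epoch the weights track the learning-progress signal instead, matching the early-to-late shift the abstract describes.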