🤖 AI Summary
This work addresses the task of restoring original, unprocessed instrument stems from professionally mixed audio that has undergone realistic degradation, and establishes the first Music Source Restoration (MSR) Challenge. We release a dataset, evaluation protocols, and baseline models. Evaluation combines objective metrics (Multi-Mel-SNR, Zimtohrli, and FAD-CLAP) on studio-produced mixtures with subjective listening tests on real-world degraded recordings. Restoration difficulty varies substantially across instruments: averaged over all participating teams, bass reaches 4.59 dB Multi-Mel-SNR, whereas percussion reaches only 0.29 dB. Among the five participating teams, the winning system attains 4.46 dB Multi-Mel-SNR and 3.47 MOS-Overall, relative improvements of 91% and 18% over the second-place system, respectively.
📝 Abstract
Music Source Restoration (MSR) aims to recover original, unprocessed instrument stems from professionally mixed and degraded audio, requiring the reversal of both production effects and real-world degradations. We present the inaugural MSR Challenge, which features objective evaluation on studio-produced mixtures using Multi-Mel-SNR, Zimtohrli, and FAD-CLAP, alongside subjective evaluation on real-world degraded recordings. Five teams participated in the challenge. The winning system achieved 4.46 dB Multi-Mel-SNR and 3.47 MOS-Overall, corresponding to relative improvements of 91% and 18% over the second-place system, respectively. Per-stem analysis reveals substantial variation in restoration difficulty across instruments, with bass averaging 4.59 dB across all teams, while percussion averages only 0.29 dB. The dataset, evaluation protocols, and baselines are available at https://msrchallenge.com/.
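To make the reported margins concrete, the following minimal sketch shows how a relative improvement of this kind is typically computed, and what second-place scores the stated percentages would imply. It assumes the common definition (winner minus runner-up, divided by runner-up); the abstract does not state the formula, the function names are hypothetical, and the back-calculated runner-up values are rough estimates derived from rounded figures, not reported results.

```python
def relative_improvement(winner: float, runner_up: float) -> float:
    """Fractional gain of the winner over the runner-up (0.91 means 91%).

    Assumed definition: (winner - runner_up) / runner_up.
    """
    return (winner - runner_up) / runner_up


def implied_runner_up(winner: float, rel_gain: float) -> float:
    """Invert the definition above to estimate the runner-up score."""
    return winner / (1.0 + rel_gain)


# Figures quoted in the abstract (winner scores and relative gains).
WINNER_SNR_DB = 4.46   # Multi-Mel-SNR of the winning system, in dB
WINNER_MOS = 3.47      # MOS-Overall of the winning system
SNR_GAIN = 0.91        # 91% over second place
MOS_GAIN = 0.18        # 18% over second place

# Estimates only: derived from rounded numbers under the assumed formula.
snr_second = implied_runner_up(WINNER_SNR_DB, SNR_GAIN)
mos_second = implied_runner_up(WINNER_MOS, MOS_GAIN)

print(f"Implied second-place Multi-Mel-SNR: {snr_second:.2f} dB")
print(f"Implied second-place MOS-Overall:   {mos_second:.2f}")
```

Under these assumptions the second-place system would sit near 2.3 dB Multi-Mel-SNR and near 2.9 MOS-Overall. Because the percentage is taken over dB values rather than linear power ratios, the 91% figure describes the gap on the metric's own scale and should not be read as a 91% gain in signal power.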