AI Summary
This work addresses high-fidelity, finely controllable bandwidth extension for music recordings degraded by limited bandwidth, such as those in historical archives, a task in which existing generative models struggle to deliver both quality and precise control. The authors propose a single-step controllable bandwidth extension method based on Flow Matching, built as an extension of FLowHigh, and introduce the Dynamic Spectral Contour (DSC) as a fine-grained conditioning signal applied through classifier-free guidance. Experiments show competitive model performance and indicate that DSC is a promising feature for fine-grained conditional generation in bandwidth extension.
Abstract
Audio restoration consists in inverting degradations of a digital audio signal to recover the pristine-quality signal that existed before the degradation occurred. This is valuable in contexts such as archives of music recordings, particularly those of precious historical value, for which a clean version may have been lost or may never have existed. Recent work has applied generative models to audio restoration, showing promising improvement over previous methods and opening the door to restoration operations that were not previously possible. However, making these models finely controllable remains a challenge. In this paper, we propose an extension of FLowHigh and introduce the Dynamic Spectral Contour (DSC) as a control signal for bandwidth extension via classifier-free guidance. Our experiments show competitive model performance, and indicate that DSC is a promising feature to support fine-grained conditioning.
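To make the conditioning mechanism concrete, the following is a minimal sketch of classifier-free guidance applied to a flow-matching sampler. It is an illustration under stated assumptions, not the paper's implementation: `toy_velocity` is a hypothetical stand-in for a trained velocity network, `cond` stands in for a conditioning vector (for example, a feature derived from the DSC, whose computation is not described here), and `guidance_scale` and `num_steps` are illustrative parameter names.

```python
import numpy as np

def guided_velocity(velocity_fn, x, t, cond, guidance_scale):
    """Blend conditional and unconditional velocity estimates (classifier-free guidance)."""
    v_cond = velocity_fn(x, t, cond)
    v_uncond = velocity_fn(x, t, None)  # None stands in for the dropped condition
    return v_uncond + guidance_scale * (v_cond - v_uncond)

def euler_sample(velocity_fn, x0, cond, guidance_scale, num_steps=1):
    """Integrate the guided velocity field from t=0 to t=1 with Euler steps.

    num_steps=1 corresponds to single-step sampling.
    """
    x = np.asarray(x0, dtype=float)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * guided_velocity(velocity_fn, x, t, cond, guidance_scale)
    return x

def toy_velocity(x, t, cond):
    """Hypothetical velocity model: a straight-line flow toward a target.

    The target is `cond` when a condition is given and zero otherwise.
    A real model would be a neural network trained with condition dropout.
    """
    target = np.zeros_like(x) if cond is None else np.asarray(cond, dtype=float)
    return (target - x) / (1.0 - t)

if __name__ == "__main__":
    x0 = np.array([0.3, 0.1, -0.4])    # starting point of the flow
    cond = np.array([1.0, -2.0, 0.5])  # toy conditioning vector
    for scale in (0.0, 1.0, 2.0):
        print(scale, euler_sample(toy_velocity, x0, cond, scale))
```

In this toy setting, a guidance scale of 0 ignores the condition, a scale of 1 follows the conditional velocity only, and larger scales push the sample further in the direction the condition indicates. This is the usual trade-off between adherence to the control signal and fidelity to the unconditional model.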