🤖 AI Summary
Traditional deep image priors (DIP) rely on pixel-wise losses and early stopping to mitigate overfitting, which limits generalization and makes training unstable. This work proposes the Deep Spectral Prior (DSP), which reformulates image reconstruction as a spectral alignment problem: instead of optimizing in the spatial domain, DSP directly matches the Fourier coefficients of the network output to those of the observed data, eliminating pixel-wise losses and early stopping. The authors introduce an implicit spectral regularization mechanism that they prove suppresses high-frequency noise, ensures convergence, enhances stability, and achieves a more favourable bias-variance trade-off. The CNN-based framework computes its loss entirely in the frequency domain, combining fast Fourier transforms with a spectral alignment loss. Experiments on denoising, inpainting, and super-resolution show that DSP outperforms DIP and other unsupervised methods, yielding higher reconstruction fidelity and robustness.
📝 Abstract
We introduce Deep Spectral Prior (DSP), a new formulation of Deep Image Prior (DIP) that redefines image reconstruction as a frequency-domain alignment problem. Unlike traditional DIP, which relies on pixel-wise loss and early stopping to mitigate overfitting, DSP directly matches Fourier coefficients between the network output and observed measurements. This shift introduces an explicit inductive bias towards spectral coherence, aligning with the known frequency structure of images and the spectral bias of convolutional neural networks. We provide a rigorous theoretical framework demonstrating that DSP acts as an implicit spectral regulariser, suppressing high-frequency noise by design and eliminating the need for early stopping. Our analysis spans four core dimensions, establishing smooth convergence dynamics, local stability, and favourable bias-variance tradeoffs. We further show that DSP naturally projects reconstructions onto a frequency-consistent manifold, enhancing interpretability and robustness. These theoretical guarantees are supported by empirical results across denoising, inpainting, and super-resolution tasks, where DSP consistently outperforms classical DIP and other unsupervised baselines.
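To make the idea of matching Fourier coefficients concrete, here is a minimal NumPy sketch of a spectral alignment loss. The abstract does not give the exact form of the loss, so this sketch assumes a mean squared difference between Fourier magnitudes; the function name `spectral_alignment_loss` and the toy images are hypothetical and are not taken from the paper.

```python
import numpy as np


def spectral_alignment_loss(output, observed):
    """Mean squared difference between 2-D Fourier magnitudes.

    Assumed form for illustration only; the paper's loss may differ.
    """
    # Orthonormal 2-D FFTs of the network output and the observation.
    out_spec = np.fft.fft2(output, norm="ortho")
    obs_spec = np.fft.fft2(observed, norm="ortho")
    # Compare magnitudes coefficient by coefficient; phase is ignored here.
    return float(np.mean((np.abs(out_spec) - np.abs(obs_spec)) ** 2))


# Toy data standing in for a network output and a noisy observation.
rng = np.random.default_rng(0)
clean = rng.random((32, 32))
noisy = clean + 0.1 * rng.standard_normal((32, 32))

loss_same = spectral_alignment_loss(clean, clean)    # identical spectra
loss_noisy = spectral_alignment_loss(clean, noisy)   # mismatched spectra
```

In a DIP-style setup, `output` would be the network's current reconstruction and the loss would be minimised over the network weights with an automatic differentiation framework. Note that a squared error on the full complex coefficients under an orthonormal FFT equals the pixel-wise squared error by Parseval's theorem, which is why this sketch compares magnitudes instead; whether the authors make the same choice is an assumption.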