🤖 AI Summary
Traditional blind super-resolution (SR) methods rely heavily on strong prior assumptions about the degradation kernel (e.g., bicubic or isotropic Gaussian kernels) and suffer severe performance degradation under complex, unknown degradations such as anisotropic, non-Gaussian, or out-of-distribution kernels. To address this, we propose KernelFusion, the first zero-shot, single-image blind SR method that imposes *no prior constraints* on kernel structure. Its core innovation is a zero-shot diffusion model grounded solely in the internal patch statistics of the input low-resolution image, enabling joint optimization of an image-specific SR kernel and the high-resolution (HR) reconstruction. Crucially, cross-scale patch similarity serves as an implicit, differentiable kernel criterion, facilitating end-to-end co-estimation of the kernel and the HR output. Extensive experiments demonstrate that KernelFusion significantly outperforms state-of-the-art methods across diverse unknown degradations, establishing a truly assumption-free paradigm for blind super-resolution.
📝 Abstract
Traditional super-resolution (SR) methods assume an "ideal" downscaling SR-kernel (e.g., bicubic downscaling) between the high-resolution (HR) image and the low-resolution (LR) image. Such methods fail once the LR images are generated differently. Current blind-SR methods aim to remove this assumption, but are still fundamentally restricted to rather simplistic downscaling SR-kernels (e.g., anisotropic Gaussian kernels), and fail on more complex (out-of-distribution) downscaling degradations. However, using the correct SR-kernel is often more important than using a sophisticated SR algorithm. In "KernelFusion" we introduce a zero-shot diffusion-based method that makes no assumptions about the kernel. Our method recovers the unique image-specific SR-kernel directly from the LR input image, while simultaneously recovering its corresponding HR image. KernelFusion exploits the principle that the correct SR-kernel is the one that maximizes patch similarity across different scales of the LR image. We first train an image-specific patch-based diffusion model on the single LR input image, capturing its unique internal patch statistics. We then reconstruct a larger HR image with the same learned patch distribution, while simultaneously recovering the correct downscaling SR-kernel that maintains this cross-scale relation between the HR and LR images. Empirical results show that KernelFusion vastly outperforms all SR baselines on complex downscaling degradations, where existing SotA Blind-SR methods fail miserably. By breaking free from predefined kernel assumptions, KernelFusion pushes Blind-SR into a new assumption-free paradigm, handling downscaling kernels previously thought impossible.
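The degradation model underlying the abstract is that the LR image is the HR image convolved with an SR-kernel and then subsampled. The NumPy sketch below is only an illustration of that model, not the paper's method: KernelFusion recovers the kernel and the HR image jointly via a patch-based diffusion prior, whereas here the HR image is assumed known and the kernel is solved in closed form by least squares (each LR pixel is linear in the kernel taps). The function names `downscale` and `estimate_kernel` are illustrative, not from the paper.

```python
import numpy as np

def downscale(hr, kernel, s):
    """Apply the degradation model: convolve HR with the SR-kernel,
    then subsample with stride s (valid region only)."""
    K = kernel.shape[0]
    H, W = hr.shape
    out_h = (H - K) // s + 1
    out_w = (W - K) // s + 1
    lr = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = hr[i * s:i * s + K, j * s:j * s + K]
            lr[i, j] = (patch * kernel).sum()
    return lr

def estimate_kernel(hr, lr, K, s):
    """Recover the K x K kernel that best maps HR to LR.
    Each LR pixel is a dot product of a flattened HR patch with the
    flattened kernel, so the problem is linear least squares."""
    rows = []
    for i in range(lr.shape[0]):
        for j in range(lr.shape[1]):
            rows.append(hr[i * s:i * s + K, j * s:j * s + K].ravel())
    A = np.asarray(rows)
    k, *_ = np.linalg.lstsq(A, lr.ravel(), rcond=None)
    return k.reshape(K, K)
```

In the noiseless case this closed-form estimate recovers the true kernel exactly; the paper's setting is harder because the HR image is itself unknown, which is why the kernel criterion there is cross-scale patch similarity of the LR image rather than a direct fit to a known HR reference.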