SuperF: Neural Implicit Fields for Multi-Image Super-Resolution

📅 2025-12-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
High-resolution (HR) image acquisition is constrained by sensor limitations, atmospheric turbulence, and cost; single-image super-resolution (SISR) often introduces structural “hallucinations” due to strong priors. While multi-image super-resolution (MISR) leverages subpixel-shifted low-resolution (LR) views to improve reconstruction fidelity, existing unsupervised methods struggle to jointly ensure geometric consistency and detail authenticity. This paper proposes the first unsupervised MISR framework that jointly optimizes a shared implicit neural representation (INR) and learnable affine alignment parameters. Built upon coordinate-based neural networks, it performs end-to-end alignment and reconstruction at test time via supersampled coordinate grids and differentiable affine estimation—requiring no HR ground truth. Our method achieves up to 8× upsampling on satellite and smartphone imagery, significantly enhancing both perceptual detail fidelity and cross-view geometric consistency, outperforming state-of-the-art unsupervised MISR approaches.

📝 Abstract
High-resolution imagery is often hindered by limitations in sensor technology, atmospheric conditions, and costs. Such challenges occur in satellite remote sensing, but also with handheld cameras, such as our smartphones. Hence, super-resolution aims to enhance the image resolution algorithmically. Since single-image super-resolution requires solving an inverse problem, such methods must exploit strong priors, e.g. learned from high-resolution training data, or be constrained by auxiliary data, e.g. by a high-resolution guide from another modality. While qualitatively pleasing, such approaches often lead to "hallucinated" structures that do not match reality. In contrast, multi-image super-resolution (MISR) aims to improve the (optical) resolution by constraining the super-resolution process with multiple views taken with sub-pixel shifts. Here, we propose SuperF, a test-time optimization approach for MISR that leverages coordinate-based neural networks, also called neural fields. Their ability to represent continuous signals with an implicit neural representation (INR) makes them an ideal fit for the MISR task. The key characteristic of our approach is to share an INR for multiple shifted low-resolution frames and to jointly optimize the frame alignment with the INR. Our approach advances related INR baselines, adopted from burst fusion for layer separation, by directly parameterizing the sub-pixel alignment as optimizable affine transformation parameters and by optimizing via a super-sampled coordinate grid that corresponds to the output resolution. Our experiments yield compelling results on simulated bursts of satellite imagery and ground-level images from handheld cameras, with upsampling factors of up to 8. A key advantage of SuperF is that this approach does not rely on any high-resolution training data.
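The test-time optimization described above can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: it assumes a small ReLU MLP with Fourier-feature encoding as the INR, one learnable 2×3 affine matrix per LR frame (initialized to identity), a super-sampled coordinate grid at the output resolution, and average pooling as the differentiable degradation from rendered HR back to each observed LR frame. All names and architectural choices here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoordMLP(nn.Module):
    """Tiny coordinate network (illustrative; the paper's INR may differ)."""
    def __init__(self, hidden=64, n_freq=4):
        super().__init__()
        self.n_freq = n_freq
        in_dim = 4 * n_freq  # sin/cos encodings for each of the 2 axes
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xy):
        # Fourier-feature positional encoding of (x, y) in [-1, 1]
        freqs = 2.0 ** torch.arange(self.n_freq, device=xy.device) * torch.pi
        enc = xy[..., None] * freqs                       # (..., 2, n_freq)
        enc = torch.cat([enc.sin(), enc.cos()], dim=-1).flatten(-2)
        return self.net(enc)

def superres(lr_frames, scale=4, iters=200, lr=1e-3):
    """Jointly fit one shared INR and per-frame affine alignment to LR frames.

    lr_frames: (n, h, w) grayscale low-resolution frames with sub-pixel shifts.
    Returns an (h*scale, w*scale) high-resolution reconstruction.
    """
    n, h, w = lr_frames.shape
    H, W = h * scale, w * scale
    inr = CoordMLP()
    # One 2x3 affine per frame, initialized to identity, optimized jointly
    theta = torch.eye(2, 3).repeat(n, 1, 1).clone().requires_grad_(True)
    opt = torch.optim.Adam(list(inr.parameters()) + [theta], lr=lr)
    # Super-sampled coordinate grid at the *output* resolution
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1)                  # (H, W, 2)
    homog = torch.cat([grid, torch.ones(H, W, 1)], dim=-1)  # homogeneous coords
    for _ in range(iters):
        opt.zero_grad()
        loss = 0.0
        for i in range(n):
            warped = homog @ theta[i].T                   # differentiable affine
            hr = inr(warped).squeeze(-1)                  # render HR view
            # Degrade rendered HR to LR (average pooling as a stand-in
            # for the true downsampling model) and compare to the observation
            lr_pred = F.avg_pool2d(hr[None, None], scale)[0, 0]
            loss = loss + F.mse_loss(lr_pred, lr_frames[i])
        loss.backward()
        opt.step()
    with torch.no_grad():
        return inr(grid).squeeze(-1)                      # final HR image
```

Because both the INR weights and the affine parameters receive gradients from the same reconstruction loss, alignment and fusion are solved in a single optimization loop, with no HR ground truth involved, which mirrors the paper's key design choice.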
Problem

Research questions and friction points this paper is trying to address.

Enhances image resolution using multiple low-resolution views
Optimizes neural fields for multi-image super-resolution alignment
Eliminates need for high-resolution training data in super-resolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural implicit fields for multi-image super-resolution
Joint optimization of frame alignment and neural representation
No reliance on high-resolution training data