🤖 AI Summary
This work addresses the severely ill-posed problem of fusing a low-resolution mosaiced hyperspectral image with a high-resolution panchromatic image, proposing a semi-supervised flow matching framework aimed at video-rate, single-shot high-resolution hyperspectral imaging. The method uses a two-stage training strategy: an unsupervised prior network is first pretrained to produce an initial pseudo high-resolution hyperspectral image, and a conditional flow matching model is then trained to generate the target image, with a random voting mechanism that iteratively refines the initial estimate. During inference, a conflict-free gradient guidance strategy keeps the reconstruction spectrally and spatially consistent. Unlike previous diffusion-based approaches, which are constrained by specific protocols or handcrafted assumptions, the framework is presented as generalizable and efficient, and it can be integrated with other unsupervised or blind image restoration algorithms. Experiments on multiple benchmark datasets show that the method outperforms representative baselines by a significant margin in both quantitative metrics and visual quality.
📝 Abstract
Fusing a low-resolution (LR) mosaiced hyperspectral image (HSI) with a high-resolution (HR) panchromatic (PAN) image offers a promising avenue for video-rate HR-HSI imaging via single-shot acquisition, yet its severely ill-posed nature remains a significant challenge. In this work, we propose a novel semi-supervised flow matching framework for mosaiced and PAN image fusion. Unlike previous diffusion-based approaches constrained by specific protocols or handcrafted assumptions, our method seamlessly integrates an unsupervised scheme with flow matching, resulting in a generalizable and efficient generative framework. Specifically, our method follows a two-stage training pipeline. First, we pretrain an unsupervised prior network to produce an initial pseudo HR-HSI. Building on this, we then train a conditional flow matching model to generate the target HR-HSI, introducing a random voting mechanism that iteratively refines the initial HR-HSI estimate, enabling robust and effective fusion. During inference, we employ a conflict-free gradient guidance strategy that ensures spectrally and spatially consistent HR-HSI reconstruction. Experiments on multiple benchmark datasets demonstrate that our method outperforms representative baselines by a significant margin, both quantitatively and qualitatively. Beyond mosaiced and PAN fusion, our approach provides a flexible generative framework that can be readily extended to other image fusion tasks and integrated with unsupervised or blind image restoration algorithms.
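To make the two generative ingredients named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the common linear-interpolation form of conditional flow matching (the abstract does not say which probability path is used), and it illustrates "conflict-free" guidance with a PCGrad-style projection, which is a swapped-in technique chosen for illustration only. All names (cfm_training_pair, cfm_loss, combine_without_conflict, velocity_fn, cond) are hypothetical.

```python
import numpy as np


def cfm_training_pair(x0, x1, t):
    """Return a point on the straight path from x0 to x1 and its target velocity.

    Assumption: linear path x_t = (1 - t) * x0 + t * x1, whose velocity is x1 - x0.
    x0, x1: arrays of shape (batch, ...); t: array of shape (batch,) in [0, 1].
    """
    t_b = t.reshape(-1, *([1] * (x0.ndim - 1)))  # broadcast t over non-batch axes
    x_t = (1.0 - t_b) * x0 + t_b * x1
    v_target = x1 - x0
    return x_t, v_target


def cfm_loss(velocity_fn, x0, x1, t, cond):
    """Mean squared error between predicted and target velocity.

    velocity_fn(x_t, t, cond) is a hypothetical stand-in for the conditional
    network; cond would hold the conditioning inputs (e.g. PAN image, LR mosaic).
    """
    x_t, v_target = cfm_training_pair(x0, x1, t)
    v_pred = velocity_fn(x_t, t, cond)
    return float(np.mean((v_pred - v_target) ** 2))


def combine_without_conflict(g_spatial, g_spectral):
    """Sum two guidance gradients after removing their conflicting components.

    PCGrad-style illustration (an assumption, not the paper's rule): if the
    gradients have a negative inner product, each is projected onto the
    hyperplane orthogonal to the other before summing.
    """
    a, b = g_spatial.ravel(), g_spectral.ravel()
    dot = float(a @ b)
    if dot >= 0.0:  # no conflict: plain sum
        return g_spatial + g_spectral
    a_proj = a - dot / float(b @ b) * b
    b_proj = b - dot / float(a @ a) * a
    return (a_proj + b_proj).reshape(g_spatial.shape)
```

With this projection, the combined direction never has a negative inner product with either original gradient, which is one simple way to read "conflict-free": a guidance step taken to improve spatial consistency does not undo spectral consistency to first order, and vice versa.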