🤖 AI Summary
This work addresses the highly ill-posed inverse problem of reconstructing hyperspectral images from RGB inputs, where existing regression-based methods often over-smooth the recovered spectra because a deterministic mapping cannot model reconstruction uncertainty. To overcome this limitation, we propose R2H-Diff, a guided spectral diffusion model that iteratively refines hyperspectral data under RGB conditioning. Our key innovations include a guided spectral refinement module, a hyperspectral-adaptive transposed attention mechanism, a normalization-free U-Net backbone, and a task-tailored linear noise schedule, which together enable high-fidelity reconstruction in just five denoising steps. Extensive experiments on the NTIRE2022, CAVE, and Harvard datasets show a favorable balance between reconstruction quality and computational efficiency: on NTIRE2022, R2H-Diff reaches 35.37 dB PSNR with only 0.58M parameters and 12.25G FLOPs, the lowest model complexity among the evaluated methods.
📝 Abstract
RGB-to-hyperspectral image reconstruction is a highly ill-posed inverse problem, since multiple plausible spectral distributions may correspond to the same RGB observation. Existing regression-based methods usually learn a deterministic mapping, which limits their ability to model reconstruction uncertainty and often leads to over-smoothed spectral responses. Although diffusion models provide strong distribution modeling capability, their direct application to hyperspectral reconstruction remains challenging due to the high spectral dimensionality, strong inter-band correlations, and strict requirement for spectral fidelity. To this end, we propose R2H-Diff, an efficient diffusion-based framework tailored for RGB-to-HSI reconstruction. Specifically, R2H-Diff formulates spectral recovery as a conditional iterative refinement process, enabling progressive reconstruction under RGB guidance. We introduce a Guided Spectral Refinement Module for RGB-conditioned feature fusion and a Hyperspectral-Adaptive Transposed Attention module for efficient spatial-spectral dependency modeling. Furthermore, a normalization-free denoising backbone is adopted to preserve spectral amplitude consistency, while a task-adapted linear noise schedule enables high-quality reconstruction with only five denoising steps. Extensive experiments on NTIRE2022, CAVE, and Harvard demonstrate that R2H-Diff achieves a favorable balance between reconstruction quality and computational efficiency. Notably, on NTIRE2022, R2H-Diff obtains 35.37 dB PSNR with only 0.58M parameters and 12.25G FLOPs, achieving the lowest model complexity among the evaluated methods while maintaining strong reconstruction fidelity.
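To make the idea of a five-step, RGB-conditioned refinement concrete, the sketch below pairs a linear noise schedule with a standard DDPM-style ancestral sampling loop. It is an illustration under stated assumptions, not the R2H-Diff implementation: the abstract does not give the schedule endpoints, the prediction target (noise versus clean signal), or the sampler, so `beta_start`, `beta_end`, the noise-prediction form, the 31-band output, and the `denoiser(x, rgb, t)` callable are all hypothetical choices made here.

```python
import numpy as np


def linear_beta_schedule(num_steps=5, beta_start=1e-4, beta_end=0.5):
    """Linearly spaced noise variances. The endpoint values are placeholders,
    not values reported for R2H-Diff."""
    return np.linspace(beta_start, beta_end, num_steps)


def sample(denoiser, rgb, num_bands=31, num_steps=5, seed=0):
    """Generic DDPM ancestral sampling conditioned on an RGB image.

    denoiser(x, rgb, t) is a hypothetical callable that returns the
    predicted noise with the same shape as x.
    """
    rng = np.random.default_rng(seed)
    betas = linear_beta_schedule(num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    height, width = rgb.shape[:2]
    # Start from pure Gaussian noise in the hyperspectral domain.
    x = rng.standard_normal((height, width, num_bands))

    for t in range(num_steps - 1, -1, -1):
        eps_hat = denoiser(x, rgb, t)
        # Posterior mean of the reverse step for a noise-predicting model.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            # Re-inject noise on every step except the last.
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x


if __name__ == "__main__":
    rgb = np.zeros((4, 4, 3))
    # Stand-in denoiser that predicts zero noise; a trained network goes here.
    cube = sample(lambda x, cond, t: np.zeros_like(x), rgb)
    print(cube.shape)
```

With so few steps, each reverse update has to remove a large share of the noise, which is presumably why the abstract emphasizes a schedule adapted to the task rather than a default one.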