🤖 AI Summary
In snapshot compressive imaging (SCI), reconstructing a high-fidelity multispectral image (MSI) from a single 2D measurement is an ill-posed inverse problem. Existing diffusion-based methods are limited by scarce labeled MSI data, domain shift induced by RGB pretraining, and slow multi-step sampling. This paper proposes a single-step diffusion refinement framework, which it describes as the first of its kind, with three components: a self-supervised equivariant imaging paradigm that removes the reliance on ground-truth labels; a lightweight single-step diffusion model that directly predicts high-frequency residuals to improve fine-detail recovery; and plug-and-play compatibility with diverse end-to-end or unrolled reconstruction networks. The authors report that the method outperforms state-of-the-art approaches in structural fidelity, cross-dataset generalization, and inference speed, while remaining simple, robust, and modular.
📝 Abstract
Coded Aperture Snapshot Spectral Imaging (CASSI) captures three-dimensional multispectral images (MSIs) by encoding them into two-dimensional measurements, from which the images must be recovered by solving a complex inverse problem. Current state-of-the-art methods, which are predominantly end-to-end, struggle to reconstruct high-frequency details and often rely on small datasets such as KAIST and CAVE, so the resulting models generalize poorly. In response to these challenges, this paper introduces a novel one-step Diffusion Probabilistic Model within a self-supervised adaptation framework for Snapshot Compressive Imaging (SCI). Our approach uses a pretrained SCI reconstruction network to generate an initial prediction from the two-dimensional measurement. A one-step diffusion model then produces a high-frequency residual that is added to this initial prediction. Because collecting MSIs is costly, we also develop a self-supervised training paradigm based on the Equivariant Imaging (EI) framework. Experimental results show that our model outperforms previous methods while remaining simple and adaptable to various end-to-end or unfolding techniques.
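The pipeline described in the abstract (initial prediction, residual refinement, label-free EI training signal) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: `cassi_forward` is a simplified single-disperser model, the scaled adjoint stands in for the pretrained SCI network, a fixed high-pass filter stands in for the one-step diffusion model, and random flips stand in for the transformation group used by EI. The loss follows the usual EI form, a measurement-consistency term plus an equivariance term weighted by a hypothetical `alpha`.

```python
import numpy as np

def cassi_forward(x, mask, step=1):
    """Simplified CASSI model (assumed): mask each band, shift it, sum onto one 2D sensor."""
    H, W, C = x.shape
    y = np.zeros((H, W + step * (C - 1)))
    for c in range(C):
        y[:, c * step:c * step + W] += mask * x[:, :, c]
    return y

def cassi_adjoint(y, mask, num_bands, step=1):
    """Adjoint of cassi_forward: shift each band back and re-apply the mask."""
    H, W = mask.shape
    return np.stack(
        [mask * y[:, c * step:c * step + W] for c in range(num_bands)], axis=-1
    )

def box_blur(x):
    """3x3 mean filter over the spatial axes, with edge padding."""
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    H, W = x.shape[:2]
    return sum(p[i:i + H, j:j + W] for i in range(3) for j in range(3)) / 9.0

def reconstruct(y, mask, num_bands):
    """Initial prediction plus a high-frequency residual (both are placeholders)."""
    # Placeholder for the pretrained SCI reconstruction network.
    x0 = cassi_adjoint(y, mask, num_bands) / num_bands
    # Placeholder for the one-step diffusion model that predicts a residual.
    residual = 0.1 * (x0 - box_blur(x0))
    return x0 + residual

def ei_loss(y, mask, num_bands, rng, alpha=1.0):
    """Equivariant-imaging style loss computed from the measurement alone."""
    x1 = reconstruct(y, mask, num_bands)
    # Measurement consistency: re-measuring the estimate should reproduce y.
    mc = np.mean((cassi_forward(x1, mask) - y) ** 2)
    # Equivariance: transform the estimate, re-measure, reconstruct, compare.
    axis = int(rng.integers(0, 2))  # random vertical or horizontal flip
    x2 = np.flip(x1, axis=axis)
    x3 = reconstruct(cassi_forward(x2, mask), mask, num_bands)
    eq = np.mean((x3 - x2) ** 2)
    return float(mc + alpha * eq)

rng = np.random.default_rng(0)
H, W, C = 8, 8, 4
mask = (rng.random((H, W)) > 0.5).astype(float)
x_true = rng.random((H, W, C))  # used only to simulate a measurement
y = cassi_forward(x_true, mask)
loss = ei_loss(y, mask, C, rng)
```

Note that `x_true` never enters `ei_loss`: the training signal depends only on the measurement `y` and the forward operator, which is what makes the paradigm usable without ground-truth MSIs. In a real system the two placeholders inside `reconstruct` would be learned networks, and the loss would be minimized with respect to the refinement model's parameters.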