🤖 AI Summary
This work addresses two limitations of existing spectral compressive imaging methods, which are predominantly confined to single-frame reconstruction: they struggle to recover spatial-spectral information occluded by coding masks, and they lack temporal consistency across video sequences. To overcome these challenges, we introduce DynaSpec, the first high-quality dynamic hyperspectral image dataset, and propose the Propagation-Guided Spectral Video Reconstruction Transformer (PG-SVRT). Using spatial-then-temporal attention and a bridged token that reduces computational complexity, PG-SVRT integrates complementary information from adjacent frames. The approach improves reconstruction quality, spectral fidelity, and temporal consistency while maintaining minimal FLOPs, establishing a benchmark for video-level spectral compressive imaging.
📝 Abstract
Recently, Spectral Compressive Imaging (SCI) has achieved remarkable success, unlocking significant potential for dynamic spectral vision. However, existing reconstruction methods, which are primarily image-based, suffer from two limitations: (i) the encoding process masks spatial-spectral features, leading to uncertainty in reconstructing missing information from a single compressed measurement, and (ii) the frame-by-frame reconstruction paradigm fails to ensure temporal consistency, which is crucial in video perception. To address these challenges, this paper seeks to advance spectral reconstruction from the image level to the video level, leveraging the complementary features and temporal continuity across adjacent frames in dynamic scenes. First, we construct the first high-quality dynamic hyperspectral image dataset (DynaSpec), comprising 30 sequences obtained through frame-scanning acquisition. Next, we propose the Propagation-Guided Spectral Video Reconstruction Transformer (PG-SVRT), which employs spatial-then-temporal attention to effectively reconstruct spectral features from abundant video information, while using a bridged token to reduce computational complexity. Finally, we conduct simulation experiments to assess the performance of four SCI systems, and construct a DD-CASSI prototype for real-world data collection and benchmarking. Extensive experiments demonstrate that PG-SVRT achieves superior performance in reconstruction quality, spectral fidelity, and temporal consistency, while maintaining minimal FLOPs. Project page: https://github.com/nju-cite/DynaSpec
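To make limitation (i) concrete, the following is a minimal sketch of a heavily simplified coded-aperture measurement, not the paper's forward model or any of the four SCI systems it evaluates. It assumes a binary checkerboard mask, no dispersion, and no noise; the function names `measure` and `shift_right`, the toy cube values, and the one-pixel motion are all hypothetical and chosen only for illustration. It shows that a pixel blocked by the mask leaves no trace in a single measurement, while the same scene point can land on an open mask position in an adjacent frame once the scene moves.

```python
# Simplified, hypothetical coded-aperture model: mask each band, then sum over bands.
# Real CASSI systems also disperse (shear) the bands; that step is omitted here.

def measure(cube, mask):
    """Collapse a (bands x H x W) cube into one H x W coded measurement."""
    bands, height, width = len(cube), len(mask), len(mask[0])
    return [
        [mask[i][j] * sum(cube[b][i][j] for b in range(bands)) for j in range(width)]
        for i in range(height)
    ]

def shift_right(cube, step=1):
    """Toy scene motion: move every band `step` (>= 1) pixels right, zero-filling."""
    return [[[0] * step + row[:-step] for row in band] for band in cube]

# Checkerboard mask: 1 = open, 0 = blocked.
mask = [[(i + j) % 2 for j in range(4)] for i in range(4)]

# Two-band toy scene with distinct values at every pixel.
cube = [[[b * 10 + i * 4 + j + 1 for j in range(4)] for i in range(4)] for b in range(2)]

frame0 = measure(cube, mask)               # measurement of the first frame
frame1 = measure(shift_right(cube), mask)  # measurement after the scene moves

# The scene point at (0, 0) is blocked in frame 0, so it contributes nothing there.
print(frame0[0][0])
# After moving one pixel right it sits under an open mask cell in frame 1.
print(frame1[0][1])
```

Even where the mask is open, the measurement holds only the sum over bands, so the per-band values remain ambiguous; this is the uncertainty a reconstruction network has to resolve, and adjacent frames supply additional constraints on it.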