🤖 AI Summary
This work addresses the challenge of multispectral demosaicing—reconstructing full-resolution spectral images from a single snapshot of mosaic measurements—where conventional methods often produce blurry results and supervised learning approaches rely on costly ground-truth data. To overcome these limitations, we propose the PEFD framework, which, for the first time, enables effective transfer of pretrained foundation models to multispectral demosaicing without requiring ground-truth supervision. PEFD integrates camera projection geometry modeling with viewpoint-equivariant fine-tuning and leverages group-theoretic structure to recover null-space information, achieving high-quality unsupervised reconstruction. Experiments on neurosurgical and autonomous driving datasets demonstrate that PEFD significantly outperforms existing unsupervised methods, recovering sharp fine details (e.g., blood vessels) with high spectral fidelity, and approaches the performance of supervised methods.
📝 Abstract
Multispectral demosaicing is crucial for reconstructing full-resolution spectral images from snapshot mosaiced measurements, enabling real-time imaging in applications from neurosurgery to autonomous driving. Classical methods produce blurry results, while supervised learning requires costly ground truth (GT) obtained from slow line-scanning systems. We propose Perspective-Equivariant Fine-tuning for Demosaicing (PEFD), a framework that learns multispectral demosaicing from mosaiced measurements alone. PEFD a) exploits the projective geometry of camera-based imaging systems, leveraging a richer group structure than previous demosaicing methods to recover more null-space information, and b) learns efficiently without GT by adapting pretrained foundation models designed for 1-3 channel imaging. On intraoperative and automotive datasets, PEFD recovers fine details such as blood vessels and preserves spectral fidelity, substantially outperforming recent unsupervised approaches and nearing supervised performance.
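The GT-free training signal sketched in the abstract can be illustrated with a minimal equivariant-imaging-style loss: the reconstruction should both agree with the mosaiced measurement and commute with a group action on the image. The sketch below is hypothetical, not the paper's implementation: the toy mosaic operator `A`, the placeholder reconstruction `f`, and a cyclic spatial shift standing in for the perspective group action are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2x2 mosaic operator A: each pixel measures exactly one of 4 spectral bands
# (assumption; real multispectral filter arrays use larger patterns).
H, W, C = 8, 8, 4
mask = np.zeros((H, W, C))
for i in range(H):
    for j in range(W):
        mask[i, j, (i % 2) * 2 + (j % 2)] = 1.0

def A(x):
    """Mosaic measurement: per-pixel band selection collapsed to one channel."""
    return (mask * x).sum(axis=-1)

def f(y):
    """Placeholder reconstruction 'network': copy the mosaic into every band.
    In PEFD this would be an adapted pretrained foundation model."""
    return np.repeat(y[..., None], C, axis=-1)

def T(x, shift=1):
    """Stand-in group action: cyclic spatial shift (the paper instead uses the
    richer perspective/projective group of camera geometry)."""
    return np.roll(x, shift, axis=(0, 1))

# Self-supervised losses from measurements alone (no ground-truth image).
y = A(rng.random((H, W, C)))                 # the only observed data
x_hat = f(y)
loss_mc = np.mean((A(x_hat) - y) ** 2)       # measurement consistency: A(f(y)) ≈ y
x_t = T(x_hat)
loss_eq = np.mean((f(A(x_t)) - x_t) ** 2)    # equivariance: f∘A should commute with T
loss = loss_mc + loss_eq
```

The equivariance term is what lets training see beyond the null space of `A`: transforming the scene and re-measuring exposes pixels (and bands) that the fixed mosaic pattern would otherwise never observe.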