🤖 AI Summary
This work addresses structure from motion (SfM) in extremely low-light conditions, where conventional methods fail because feature extraction becomes unreliable at signal-to-noise ratios (SNRs) below −4 dB. The authors propose Dark3R, a teacher–student distillation framework that transfers knowledge from large-scale 3D foundation models to the ultra-low-light regime without any 3D supervision: training relies solely on paired noisy and clean raw images, which can be captured directly or synthesized by applying a Poisson–Gaussian noise model to well-exposed raw images. They also introduce a new exposure-bracketed, multi-view raw image dataset with ground-truth 3D annotations for training and evaluation. Dark3R achieves state-of-the-art SfM in the low-SNR regime, and its predicted poses, combined with a coarse-to-fine radiance field optimization procedure, also yield state-of-the-art novel view synthesis in the dark.
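For intuition about the operating regime: decibel SNR is $10\log_{10}(P_{\text{signal}}/P_{\text{noise}})$, so −4 dB means the noise power is roughly 2.5× the signal power. A minimal sketch of the conversion (the definition is standard, not specific to this paper):

```python
import math

def snr_db(signal_power: float, noise_power: float) -> float:
    """SNR in decibels from signal and noise power."""
    return 10.0 * math.log10(signal_power / noise_power)

# At 0 dB, signal and noise power are equal; below 0 dB, noise dominates.
# A power ratio of ~0.398 corresponds to about -4 dB.
```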
📝 Abstract
We introduce Dark3R, a framework for structure from motion in the dark that operates directly on raw images with signal-to-noise ratios (SNRs) below $-4$ dB -- a regime where conventional feature- and learning-based methods break down. Our key insight is to adapt large-scale 3D foundation models to extreme low-light conditions through a teacher--student distillation process, enabling robust feature matching and camera pose estimation in low light. Dark3R requires no 3D supervision; it is trained solely on noisy--clean raw image pairs, which can be either captured directly or synthesized using a simple Poisson--Gaussian noise model applied to well-exposed raw images. To train and evaluate our approach, we introduce a new, exposure-bracketed dataset that includes $\sim$42,000 multi-view raw images with ground-truth 3D annotations, and we demonstrate that Dark3R achieves state-of-the-art structure from motion in the low-SNR regime. Further, we demonstrate state-of-the-art novel view synthesis in the dark using Dark3R's predicted poses and a coarse-to-fine radiance field optimization procedure.
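The Poisson–Gaussian model referenced above combines signal-dependent shot noise (Poisson photon arrivals) with signal-independent read noise (Gaussian sensor electronics). A minimal sketch of how a noisy–clean training pair could be synthesized from a well-exposed raw image, assuming linear raw values; the function name and parameter values are illustrative, not taken from the paper:

```python
import numpy as np

def synthesize_low_light_raw(clean_raw, gain=0.5, read_noise_sigma=2.0, rng=None):
    """Apply a Poisson-Gaussian noise model to a well-exposed linear raw image.

    clean_raw        : array of linear raw intensities (digital numbers).
    gain             : digital numbers per photoelectron; larger gain means
                       fewer photons per DN and hence stronger shot noise.
    read_noise_sigma : std. dev. of the additive Gaussian read noise, in DN.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Shot noise: convert to the electron domain, where arrivals are Poisson.
    electrons = np.clip(clean_raw, 0.0, None) / gain
    shot = rng.poisson(electrons).astype(np.float64) * gain
    # Read noise: signal-independent Gaussian added by the readout circuitry.
    return shot + rng.normal(0.0, read_noise_sigma, size=clean_raw.shape)
```

Because the Poisson component has mean equal to the clean signal, the synthesized image is unbiased on average; only its per-pixel variance grows as the simulated light level drops.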