🤖 AI Summary
This work proposes a teacher-student framework for single-view 3D Gaussian splatting to address the inherent challenges of scale ambiguity and limited extrapolation in monocular 3D scene reconstruction. By integrating the teacher-student paradigm with 3D Gaussian splatting for the first time, the method leverages a multi-view teacher model to provide geometric supervision and introduces an extrapolation network to recover missing contextual information. The proposed approach substantially mitigates scale ambiguity, enhances out-of-view synthesis capabilities, and achieves state-of-the-art performance in novel view synthesis from a single input image. Moreover, it attains scene-level reconstruction quality comparable to multi-view methods and demonstrates strong results in self-supervised monocular depth estimation.
📝 Abstract
Recent advances in feed-forward 3D Gaussian splatting have enabled remarkable multi-view 3D scene reconstruction and single-view 3D object reconstruction, but single-view 3D scene reconstruction remains under-explored due to the inherent ambiguity of a single view. We present \textbf{studentSplat}, a single-view 3D Gaussian splatting method for scene reconstruction. To overcome the scale ambiguity and extrapolation problems inherent in novel-view supervision from a single input, we introduce two techniques: 1) a teacher-student architecture in which a multi-view teacher model provides geometric supervision to the single-view student during training, addressing scale ambiguity and encouraging geometric validity; and 2) an extrapolation network that completes missing scene context, enabling high-quality extrapolation. Extensive experiments show that studentSplat achieves state-of-the-art single-view novel-view reconstruction quality and performance comparable to multi-view methods at the scene level. Furthermore, studentSplat performs competitively as a self-supervised single-view depth estimation method, highlighting its potential for general single-view 3D understanding tasks.