Masks make discriminative models great again!

📅 2025-07-01
🤖 AI Summary
This paper isolates the image-to-3D lifting step of single-image 3D reconstruction: recovering only the regions that are visible in the input image. It proposes Image2GS, a method that decouples geometric lifting from the completion of unseen regions. First, visibility masks are derived from optimized 3D Gaussian splats, and lifting is treated as a separate problem whose supervision is restricted to the visible area. Second, a mask-aware training strategy lets the discriminative model concentrate on content visible from the source view. Experiments show that Image2GS improves reconstruction quality in visible regions over strong baselines while remaining competitive with state-of-the-art discriminative methods when evaluated on complete scenes. These results support the “learn-only-visible” approach.

📝 Abstract
We present Image2GS, a novel approach that addresses the challenging problem of reconstructing photorealistic 3D scenes from a single image by focusing specifically on the image-to-3D lifting component of the reconstruction process. By decoupling the lifting problem (converting an image to a 3D model representing what is visible) from the completion problem (hallucinating content not present in the input), we create a more deterministic task suitable for discriminative models. Our method employs visibility masks derived from optimized 3D Gaussian splats to exclude areas not visible from the source view during training. This masked training strategy significantly improves reconstruction quality in visible regions compared to strong baselines. Notably, despite being trained only on masked regions, Image2GS remains competitive with state-of-the-art discriminative models trained on full target images when evaluated on complete scenes. Our findings highlight the fundamental struggle discriminative models face when fitting unseen regions and demonstrate the advantages of addressing image-to-3D lifting as a distinct problem with specialized techniques.
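
The masked training idea in the abstract can be illustrated with a minimal sketch. It assumes a plain per-pixel squared-error loss and a binary visibility mask supplied as an array. The paper's actual loss terms and mask format are not given in this summary, and the function names below (masked_l2_loss, masked_psnr) are hypothetical.

```python
import numpy as np


def masked_l2_loss(pred, target, visible, eps=1e-8):
    """Mean squared error over pixels marked visible from the source view.

    pred, target: float arrays of shape (H, W, C), a rendered target view and
                  its ground-truth image.
    visible:      bool or {0, 1} array of shape (H, W); 1 means the pixel
                  shows content that was visible from the source view.
    NOTE: an assumed illustrative loss, not the loss used by Image2GS.
    """
    weights = visible.astype(pred.dtype)[..., None]   # (H, W, 1), broadcast over channels
    sq_err = (pred - target) ** 2 * weights           # masked-out pixels contribute 0
    denom = weights.sum() * pred.shape[-1]            # number of supervised values
    return float(sq_err.sum() / (denom + eps))


def masked_psnr(pred, target, visible, max_val=1.0):
    """PSNR restricted to visible pixels (an assumed evaluation helper)."""
    mse = max(masked_l2_loss(pred, target, visible), 1e-12)  # avoid log of zero
    return float(10.0 * np.log10(max_val ** 2 / mse))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((4, 4, 3))
    pred = target.copy()
    pred[0, 0] += 0.5                   # error placed in a single pixel
    full = np.ones((4, 4), dtype=bool)
    mask = full.copy()
    mask[0, 0] = False                  # declare that pixel unseen from the source view
    print("full-image loss:", masked_l2_loss(pred, target, full))
    print("masked loss:    ", masked_l2_loss(pred, target, mask))
```

In this sketch, an error that falls entirely inside the masked-out region does not affect the loss, so the model receives no penalty (and no gradient signal) for content it could not have observed from the source view.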
Problem

Research questions and friction points this paper is trying to address.

Reconstructing photorealistic 3D scenes from a single image
Decoupling image-to-3D lifting from scene completion
Improving reconstruction quality using visibility masks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses visibility masks for training
Decouples lifting from completion problem
Employs 3D Gaussian splats