Seen2Scene: Completing Realistic 3D Scenes with Visibility-Guided Flow

📅 2026-03-30
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limited generalizability of existing scene completion methods, which rely on complete, and therefore synthetic, 3D data and struggle with incomplete real-world scans. The authors propose a flow matching-based framework for scene completion and generation that, for the first time, trains directly on real, partial 3D scans. A visibility-guided mechanism explicitly masks out unknown regions during training, and scenes are represented as TSDF volumes stored in sparse grids and modeled with a sparse transformer. The model is conditioned on 3D layout boxes and can be adapted to other inputs such as text or partial scans. On complex real-world scenes, the method generates coherent, complete, and realistic 3D geometry and outperforms baselines in both completion accuracy and generation quality.
📝 Abstract
We present Seen2Scene, the first flow matching-based approach that trains directly on incomplete, real-world 3D scans for scene completion and generation. Unlike prior methods that rely on complete and hence synthetic 3D data, our approach introduces visibility-guided flow matching, which explicitly masks out unknown regions in real scans, enabling effective learning from real-world, partial observations. We represent 3D scenes using truncated signed distance field (TSDF) volumes encoded in sparse grids and employ a sparse transformer to efficiently model complex scene structures while masking unknown regions. We employ 3D layout boxes as an input conditioning signal, and our approach is flexibly adapted to various other inputs such as text or partial scans. By learning directly from real-world, incomplete 3D scans, Seen2Scene enables realistic 3D scene completion for complex, cluttered real environments. Experiments demonstrate that our model produces coherent, complete, and realistic 3D scenes, outperforming baselines in completion accuracy and generation quality.
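The abstract describes visibility-guided flow matching only at a high level, so the following is a minimal sketch of one way such a loss could look, not the authors' implementation. It assumes a straight-line (rectified-flow style) path between noise and data, a per-voxel squared error, and a boolean visibility mask per sparse voxel; the function names `interpolate` and `visibility_masked_loss` and the toy values are hypothetical. In a real system the predicted velocity would come from the sparse transformer rather than being constructed by hand.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1).

    Assumption: a linear path; the abstract does not specify the path used.
    """
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def visibility_masked_loss(pred_v, x0, x1, visible):
    """Mean squared velocity error over observed voxels only.

    pred_v  : predicted velocity per voxel (stand-in for a network output)
    x0, x1  : noise sample and scanned TSDF value per voxel
    visible : True where the scan actually observed the voxel
    Voxels with visible == False contribute nothing to the loss.
    """
    target_v = [b - a for a, b in zip(x0, x1)]  # velocity of the linear path
    sq_err = [(p - v) ** 2 for p, v in zip(pred_v, target_v)]
    n_visible = sum(visible)
    if n_visible == 0:
        return 0.0  # nothing observed, so there is nothing to supervise
    return sum(e for e, m in zip(sq_err, visible) if m) / n_visible


# Toy example: six sparse voxels, two of them never observed by the scanner.
random.seed(0)
x1 = [random.uniform(-1.0, 1.0) for _ in range(6)]  # toy TSDF values
x0 = [random.gauss(0.0, 1.0) for _ in range(6)]     # Gaussian noise sample
visible = [True, True, True, False, False, True]

xt = interpolate(x0, x1, 0.3)  # what a velocity model would see at t = 0.3

# A prediction that is exact on observed voxels and wrong on unknown ones.
pred_v = [b - a for a, b in zip(x0, x1)]
pred_v = [p if m else p + 5.0 for p, m in zip(pred_v, visible)]

loss = visibility_masked_loss(pred_v, x0, x1, visible)
```

Because the unknown voxels are masked out, errors there do not affect `loss`, which leaves the model free to generate plausible geometry in regions the scan never saw instead of being pushed to reproduce missing data.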
Problem

Research questions and friction points this paper is trying to address.

3D scene completion
real-world 3D scans
incomplete data
scene generation
visibility-aware modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

visibility-guided flow matching
scene completion
real-world 3D scans
sparse transformer
TSDF