InstaInpaint: Instant 3D-Scene Inpainting with Masked Large Reconstruction Model

📅 2025-06-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 3D scene inpainting methods rely on iterative optimization and fail to meet real-time requirements. This paper proposes a sub-second, feed-forward 3D inpainting framework built on a customized Large Reconstruction Model (LRM). It introduces a reference-guided feed-forward architecture and a self-supervised masked fine-tuning strategy, enabling end-to-end reconstruction that is geometrically consistent and texturally coherent. Given only a 2D mask, the method generates a complete inpainted 3D scene in 0.4 seconds, fast enough for interactive AR/VR applications such as multi-region editing and object insertion. Compared to state-of-the-art optimization-based approaches, it achieves a 1000× speedup while maintaining top-tier performance on two standard benchmarks, and extensive experiments validate its strong generalization and practical real-time usability in real-world scenarios.

📝 Abstract
Recent advances in 3D scene reconstruction enable real-time viewing in virtual and augmented reality. To support interactive operations for better immersiveness, such as moving or editing objects, 3D scene inpainting methods are proposed to repair or complete the altered geometry. However, current approaches rely on lengthy and computationally intensive optimization, making them impractical for real-time or online applications. We propose InstaInpaint, a reference-based feed-forward framework that produces 3D-scene inpainting from a 2D inpainting proposal within 0.4 seconds. We develop a self-supervised masked-finetuning strategy to enable training of our custom large reconstruction model (LRM) on the large-scale dataset. Through extensive experiments, we analyze and identify several key designs that improve generalization, textural consistency, and geometric correctness. InstaInpaint achieves a 1000x speed-up from prior methods while maintaining a state-of-the-art performance across two standard benchmarks. Moreover, we show that InstaInpaint generalizes well to flexible downstream applications such as object insertion and multi-region inpainting. More video results are available at our project page: https://dhmbb2.github.io/InstaInpaint_page/.
Problem

Research questions and friction points this paper is trying to address.

Real-time 3D scene inpainting for interactive operations
Overcoming slow optimization in current inpainting methods
Enhancing generalization and consistency in 3D reconstruction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reference-based feed-forward 3D inpainting framework
Self-supervised masked-finetuning for large reconstruction model
1000x speed-up with state-of-the-art performance
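The self-supervised masked-finetuning idea can be sketched roughly as follows: input views are partially masked, the reconstruction model fills them in, and the original unmasked views supervise the result. This is an illustrative sketch only; the function names, the rectangular mask shape, and the MSE loss are assumptions, and `reconstruct_fn` stands in for the paper's actual LRM forward pass.

```python
import numpy as np

def random_mask(image, mask_frac=0.3, rng=None):
    """Zero out a random rectangle covering ~mask_frac of the image area.
    (Illustrative stand-in; the paper's masks come from 2D inpainting proposals.)"""
    rng = rng or np.random.default_rng(0)
    h, w = image.shape[:2]
    mh = int(h * mask_frac ** 0.5)
    mw = int(w * mask_frac ** 0.5)
    top = int(rng.integers(0, h - mh + 1))
    left = int(rng.integers(0, w - mw + 1))
    masked = image.copy()
    masked[top:top + mh, left:left + mw] = 0.0
    return masked

def masked_finetune_step(views, reconstruct_fn, mask_frac=0.3, rng=None):
    """One self-supervised step: mask the input views, reconstruct,
    and score the reconstruction against the original (unmasked) views."""
    rng = rng or np.random.default_rng(0)
    masked_views = [random_mask(v, mask_frac, rng) for v in views]
    recon = reconstruct_fn(masked_views)  # stand-in for the LRM forward pass
    # MSE between reconstructed and original views (a placeholder objective;
    # the paper also uses geometric and textural constraints).
    loss = float(np.mean([(r - v) ** 2 for r, v in zip(recon, views)]))
    return loss
```

A model that perfectly restores the masked content drives this loss to zero, which is the self-supervision signal: no 3D inpainting ground truth is needed, only the original scene captures.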