Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model

📅 2024-09-25
🏛️ Visual Informatics
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing 3D scene editing methods struggle to insert arbitrary novel objects with high fidelity while preserving geometric and appearance consistency across views. This paper introduces the first generative object insertion framework tailored for Gaussian Splatting representations. It pioneers the integration of multi-view consistent diffusion models into 3D editing, enabling end-to-end, mask-free, and fine-tuning-free view-consistent synthesis. The approach jointly optimizes differentiable Gaussian rendering, a cross-view feature alignment loss, and implicit shape-appearance co-modeling. Evaluated on real multi-view imagery, the method generates novel objects that are consistent in illumination, pose, and semantics across all viewpoints. Both qualitative and quantitative evaluations demonstrate superior performance over NeRF-based and conventional editing baselines. The method also achieves significantly faster inference than generative NeRF approaches, offering a practical, high-fidelity solution for 3D scene composition.

Problem

Research questions and friction points this paper is trying to address.

Generating high-quality 3D object insertions in scenes
Ensuring view-consistency in multi-view object inpainting
Refining Gaussian Splatting reconstruction from sparse inpainted views

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-view diffusion model for consistent inpainting
ControlNet-based module for controlled generation
Mask-aware 3D reconstruction for refined results
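
The summary does not say how the mask-aware reconstruction step weights pixels, so the following is only a minimal sketch of one plausible form: an L1 photometric loss between a rendered view and an inpainted target view, weighted more heavily inside the inserted-object mask. The function name `mask_aware_l1` and the parameters `inside_weight` and `outside_weight` are hypothetical and do not come from the paper.

```python
import numpy as np


def mask_aware_l1(rendered, target, mask, inside_weight=1.0, outside_weight=0.1):
    """Weighted L1 loss between a rendered view and an inpainted target view.

    Hypothetical illustration only; the weighting scheme is an assumption,
    not the paper's formulation.

    rendered, target: float arrays of shape (H, W, 3)
    mask: array of shape (H, W), 1 inside the inserted-object region, 0 elsewhere
    """
    mask = mask.astype(np.float64)
    # Per-pixel weight: inside_weight in the masked region, outside_weight elsewhere.
    weights = inside_weight * mask + outside_weight * (1.0 - mask)
    # Mean absolute error over the colour channels, one value per pixel.
    per_pixel = np.abs(rendered - target).mean(axis=-1)
    # Normalise by the total weight so the loss scale does not depend on mask size.
    return float((weights * per_pixel).sum() / weights.sum())
```

With these default weights, an error inside the mask contributes ten times as much as the same error outside it, so optimization concentrates on the inserted object while the background is still loosely constrained.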
Hongliang Zhong
Department of Computer Science, City University of Hong Kong, Hong Kong, China
Can Wang
Department of Computer Science, City University of Hong Kong, Hong Kong, China
Jingbo Zhang
City University of Hong Kong
Computer Vision, Computer Graphics, 3D Reconstruction and Rendering, NeRF
Jing Liao
Department of Computer Science, City University of Hong Kong, Hong Kong, China