🤖 AI Summary
This work addresses two critical limitations in existing 3D inpainting methods: poor robustness under large viewpoint variations in single-view approaches, and appearance/geometry inconsistency in multi-view diffusion-based repair. We propose the first unified 3D inpainting framework supporting object removal, retexturing, and replacement. Methodologically, we introduce a multi-reference view adaptive selection strategy, an attention-based feature propagation (AFP) mechanism to jointly optimize geometry and texture across views, and a texture-geometry joint score distillation sampling (TG-SDS) loss that explicitly enforces consistency between reconstructed geometry and surface appearance within repaired regions. Experiments demonstrate substantial improvements in cross-view consistency and robustness to large-angle reconstruction, effectively mitigating the geometric distortion and visual incoherence, particularly under significant structural changes, where prior methods fail.
📝 Abstract
Developing a unified pipeline that enables users to remove, re-texture, or replace objects in a versatile manner is crucial for text-guided 3D inpainting. However, there are still challenges in performing multiple 3D inpainting tasks within a unified framework: 1) single-reference inpainting methods lack robustness when dealing with views that are far from the reference view; 2) appearance inconsistency arises when independently inpainting multi-view images with 2D diffusion priors; 3) geometry inconsistency limits performance when there are significant geometric changes in the inpainting regions. To tackle these challenges, we introduce DiGA3D, a novel and versatile 3D inpainting pipeline that leverages diffusion models to propagate consistent appearance and geometry in a coarse-to-fine manner. First, DiGA3D develops a robust strategy for selecting multiple reference views to reduce errors during propagation. Next, DiGA3D designs an Attention Feature Propagation (AFP) mechanism that propagates attention features from the selected reference views to other views via diffusion models to maintain appearance consistency. Furthermore, DiGA3D introduces a Texture-Geometry Score Distillation Sampling (TG-SDS) loss to further improve the geometric consistency of inpainted 3D scenes. Extensive experiments on multiple 3D inpainting tasks demonstrate the effectiveness of our method. The project page is available at https://rorisis.github.io/DiGA3D/.
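The TG-SDS loss builds on score distillation sampling (SDS), which distills a 2D diffusion prior into a 3D representation by noising a rendered view and using the model's noise-prediction error as a gradient. The paper's exact TG-SDS formulation is not given here, but the generic SDS gradient it extends can be sketched as follows (a minimal illustration; `mock_denoiser` is a hypothetical stand-in for a pretrained diffusion model, and the weighting `w(t)` is one common choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def mock_denoiser(x_noisy, t):
    # Hypothetical stand-in for a pretrained diffusion model's
    # noise prediction; a real pipeline would call e.g. a U-Net.
    return 0.1 * x_noisy

def sds_gradient(x, t, alpha_bar):
    """Generic SDS gradient on a rendered view x:
    grad = w(t) * (eps_pred - eps),
    which is back-propagated through the renderer to the 3D
    representation's parameters during optimization."""
    eps = rng.standard_normal(x.shape)                      # sampled Gaussian noise
    x_noisy = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
    eps_pred = mock_denoiser(x_noisy, t)                    # prior's noise estimate
    w_t = 1.0 - alpha_bar                                   # one common weighting
    return w_t * (eps_pred - eps)

x = rng.standard_normal((4, 4, 3))  # toy rendered view (H, W, RGB)
g = sds_gradient(x, t=500, alpha_bar=0.5)
print(g.shape)
```

TG-SDS differs from this vanilla form in that it scores texture and geometry jointly rather than appearance alone, but the noising/denoising skeleton above is the shared backbone.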