Track, Inpaint, Resplat: Subject-driven 3D and 4D Generation with Progressive Texture Infilling

📅 2025-10-27
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing 3D/4D generation methods often sacrifice cross-view semantic consistency in favor of photorealism and efficiency, and subject-driven personalized generation remains underexplored in the 3D/4D domain. To address this, we propose TIRE, a framework that combines video object tracking with diffusion-based, subject-aware 2D inpainting to localize and progressively rectify multi-view texture inconsistencies. The corrected 2D images are then reprojected into 3D via resplatting with 3D Gaussian Splatting, preserving the original geometry. Crucially, TIRE improves subject identity consistency across views and fine-grained appearance fidelity without modifying geometry. Experiments demonstrate that TIRE outperforms state-of-the-art methods on multiple benchmarks, enabling high-quality, highly consistent 3D/4D content generation from a single image or a few input images.

πŸ“ Abstract
Current 3D/4D generation methods are usually optimized for photorealism, efficiency, and aesthetics. However, they often fail to preserve the semantic identity of the subject across different viewpoints. Adapting generation methods with one or a few images of a specific subject (also known as personalization or subject-driven generation) allows generating visual content that aligns with the identity of the subject. However, personalized 3D/4D generation is still largely underexplored. In this work, we introduce TIRE (Track, Inpaint, REsplat), a novel method for subject-driven 3D/4D generation. It takes an initial 3D asset produced by an existing 3D generative model as input and uses video tracking to identify the regions that need to be modified. Then, we adopt a subject-driven 2D inpainting model to progressively infill the identified regions. Finally, we resplat the modified 2D multi-view observations back to 3D while still maintaining consistency. Extensive experiments demonstrate that our approach significantly improves identity preservation in 3D/4D generation compared to state-of-the-art methods. Our project website is available at https://zsh2000.github.io/track-inpaint-resplat.github.io/.
Problem

Research questions and friction points this paper is trying to address.

Preserving subject identity across 3D/4D viewpoints
Modifying initial 3D assets for semantic consistency
Progressively improving texture with subject-driven inpainting
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tracks regions needing modification via video
Progressively inpaints regions using subject-driven model
Resplats modified 2D views into consistent 3D
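The three steps above can be sketched as a toy loop (a minimal sketch with illustrative stubs; function names, the flat per-pixel representation, and the averaging "resplat" are assumptions for illustration, not the paper's actual models or API):

```python
def track_inconsistent_regions(view):
    """Stub for video tracking: flag pixels that deviate from the
    subject reference as regions needing modification."""
    return [px != ref for px, ref in zip(view["pixels"], view["reference"])]

def inpaint_subject(view, mask):
    """Stub for subject-driven 2D inpainting: overwrite only the
    flagged pixels with the reference subject appearance."""
    return [ref if m else px
            for px, ref, m in zip(view["pixels"], view["reference"], mask)]

def resplat(views):
    """Stub for resplatting: aggregate the corrected 2D views back
    into one shared representation (here, a per-pixel average)."""
    n = len(views)
    return [sum(v["pixels"][i] for v in views) / n
            for i in range(len(views[0]["pixels"]))]

def tire_pipeline(views):
    for view in views:                              # progressive, view by view
        mask = track_inconsistent_regions(view)     # 1. Track
        view["pixels"] = inpaint_subject(view, mask)  # 2. Inpaint
    return resplat(views)                           # 3. Resplat
```

The point of the structure is that geometry never changes: only the 2D texture observations are corrected before being splatted back into the shared 3D representation.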