GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing

📅 2025-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenging problem of generating geometrically accurate and multi-view consistent 3D editable garment models from a single input image. We propose a progressive depth-guided multi-view diffusion modeling framework that jointly leverages single-image depth estimation, differentiable image warping, and RGB-depth co-inference. A deformation field explicitly encodes garment geometry priors, guiding a multi-view conditional diffusion model to reconstruct both texture and geometry in a coordinated manner. Our key contribution is the first integration of image warping as an explicit geometric constraint into the diffusion process, enabling end-to-end generation of multi-view consistent 3D garments from a single image. Experiments demonstrate significant improvements over state-of-the-art methods in visual fidelity, structural accuracy, and cross-view consistency. Moreover, the generated models support intuitive post-hoc editing, making the approach accessible to non-expert users.

📝 Abstract
We introduce GarmentCrafter, a new approach that enables non-professional users to create and modify 3D garments from a single-view image. While recent advances in image generation have facilitated 2D garment design, creating and editing 3D garments remains challenging for non-professional users. Existing methods for single-view 3D reconstruction often rely on pre-trained generative models to synthesize novel views conditioned on the reference image and camera pose, yet they lack cross-view consistency, failing to capture the internal relationships across different views. In this paper, we tackle this challenge through progressive depth prediction and image warping to approximate novel views. Subsequently, we train a multi-view diffusion model to complete occluded and unknown clothing regions, informed by the evolving camera pose. By jointly inferring RGB and depth, GarmentCrafter enforces inter-view coherence and reconstructs precise geometries and fine details. Extensive experiments demonstrate that our method achieves superior visual fidelity and inter-view coherence compared to state-of-the-art single-view 3D garment reconstruction methods.
Problem

Research questions and friction points this paper is trying to address.

Enables 3D garment creation from single-view images
Improves cross-view consistency in 3D garment reconstruction
Reconstructs precise geometries and fine details in garments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressive depth prediction for novel views
Multi-view diffusion model for occlusion completion
Joint RGB and depth inference for coherence
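The depth-guided warping step underlying the pipeline above can be sketched as follows. This is a minimal illustration assuming a pinhole camera model with intrinsics `K` and a rigid source-to-target transform; it is not the paper's actual implementation. Each pixel is unprojected with its predicted depth, moved into the novel camera frame, and reprojected; pixels with no source correspondence remain empty, and it is exactly these occluded regions that the multi-view diffusion model is trained to complete.

```python
import numpy as np

def warp_to_novel_view(image, depth, K, T_src_to_tgt):
    """Forward-warp a source image to a novel view using per-pixel depth.

    image: (H, W, C) source image.
    depth: (H, W) per-pixel depth in the source camera frame.
    K: (3, 3) pinhole intrinsics (assumed shared by both views).
    T_src_to_tgt: (4, 4) rigid transform from source to target camera.
    Returns the warped image; unfilled pixels stay zero (holes to complete).
    """
    h, w = depth.shape
    # Pixel grid in homogeneous coordinates, shape (3, N).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    # Unproject each pixel to a 3D point in the source camera frame.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    # Rigid transform into the target camera frame.
    pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])
    pts_tgt = (T_src_to_tgt @ pts_h)[:3]
    # Reproject into target pixel coordinates.
    proj = K @ pts_tgt
    z = proj[2]
    uu = np.round(proj[0] / z).astype(int)
    vv = np.round(proj[1] / z).astype(int)
    # Splat valid pixels; anything not hit remains a hole.
    out = np.zeros_like(image)
    valid = (z > 0) & (uu >= 0) & (uu < w) & (vv >= 0) & (vv < h)
    src = image.reshape(-1, image.shape[-1])
    out[vv[valid], uu[valid]] = src[valid]
    return out
```

A real system would splat with z-buffering and sub-pixel interpolation, but the sketch shows the core geometric constraint: warping ties texture in the novel view to the predicted depth, so RGB and geometry must stay consistent.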
Yuanhao Wang
Princeton University
Cheng Zhang
Texas A&M University
Gonçalo Frazão
Carnegie Mellon University
Jinlong Yang
Google AR
A. Ichim
Google AR
T. Beeler
Google AR
Fernando De la Torre
Research Associate Professor, Carnegie Mellon University
Pattern Recognition · Computer Vision · Augmented Reality · Virtual Reality