Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image

📅 2026-03-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limitations of existing single-image 3D human reconstruction methods, which rely on rigid joint transformations and struggle to capture realistic clothing dynamics. The authors propose DynaAvatar, a framework that, for the first time, enables zero-shot generation of animatable 3D human avatars from a single image while accurately recovering motion-induced clothing deformations. Built on a Transformer-based feed-forward architecture, DynaAvatar directly predicts dynamic 3D Gaussian deformations without test-time optimization. The method combines static-to-dynamic knowledge transfer via lightweight LoRA fine-tuning, a DynaFlow optical-flow-guided loss, and an SMPL-X re-annotation strategy to improve the fidelity of dynamic clothing modeling. Experiments show that DynaAvatar substantially outperforms prior approaches in both visual realism and generalization.
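
The summary names LoRA fine-tuning as the mechanism for static-to-dynamic knowledge transfer but gives no implementation details. Below is a minimal sketch, assuming a PyTorch Transformer whose attention projections are named `q_proj`/`k_proj`/`v_proj`/`out_proj` (our assumption, not the paper's code): the pretrained static-capture weights stay frozen while only small low-rank matrices are trained on dynamic captures.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen pretrained linear layer with a trainable low-rank update (LoRA)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the static-capture weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # pretrained projection + low-rank correction learned on dynamic captures
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale


def add_lora_to_attention(transformer: nn.Module, rank: int = 8) -> nn.Module:
    """Replace q/k/v/out projections of each attention block with LoRA wrappers
    (attribute names are illustrative and depend on the actual model definition)."""
    for _, module in transformer.named_modules():
        for attr in ("q_proj", "k_proj", "v_proj", "out_proj"):
            child = getattr(module, attr, None)
            if isinstance(child, nn.Linear):
                setattr(module, attr, LoRALinear(child, rank=rank))
    return transformer
```

With this arrangement the trainable parameter count per projection scales only with rank × (in_features + out_features), which is what makes the adaptation lightweight compared to full fine-tuning.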

📝 Abstract
Existing single-image 3D human avatar methods primarily rely on rigid joint transformations, limiting their ability to model realistic cloth dynamics. We present DynaAvatar, a zero-shot framework that reconstructs animatable 3D human avatars with motion-dependent cloth dynamics from a single image. Trained on large-scale multi-person motion datasets, DynaAvatar employs a Transformer-based feed-forward architecture that directly predicts dynamic 3D Gaussian deformations without subject-specific optimization. To overcome the scarcity of dynamic captures, we introduce a static-to-dynamic knowledge transfer strategy: a Transformer pretrained on large-scale static captures provides strong geometric and appearance priors, which are efficiently adapted to motion-dependent deformations through lightweight LoRA fine-tuning on dynamic captures. We further propose the DynaFlow loss, an optical flow-guided objective that provides reliable motion-direction geometric cues for cloth dynamics in rendered space. Finally, we reannotate the missing or noisy SMPL-X fittings in existing dynamic capture datasets, as most public dynamic capture datasets contain incomplete or unreliable fittings that are unsuitable for training high-quality 3D avatar reconstruction models. Experiments demonstrate that DynaAvatar produces visually rich and generalizable animations, outperforming prior methods.
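
The abstract describes DynaFlow as an optical-flow-guided objective providing motion-direction cues in rendered space, but does not give its exact form. The sketch below is one plausible reading, assuming the predicted 3D Gaussian deformations are projected to 2D and compared against flow from an off-the-shelf estimator such as RAFT; all function names, shapes, and the L1 penalty are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def flow_guided_loss(xy_t, xy_t1, flow_pred, valid_mask=None):
    """Sketch of an optical-flow-guided motion term in rendered (screen) space.

    xy_t, xy_t1 : (N, 2) projected 2D positions of the Gaussians at frames t and t+1,
                  assumed to come from a differentiable projection of the predicted
                  3D Gaussian deformations.
    flow_pred   : (H, W, 2) optical flow between the two frames from an off-the-shelf
                  estimator (e.g. RAFT), treated as a fixed supervision signal.
    """
    H, W, _ = flow_pred.shape

    # Normalize Gaussian positions at frame t to [-1, 1] for grid_sample (x = width, y = height).
    grid = xy_t.clone()
    grid[:, 0] = 2.0 * grid[:, 0] / (W - 1) - 1.0
    grid[:, 1] = 2.0 * grid[:, 1] / (H - 1) - 1.0
    grid = grid.view(1, 1, -1, 2)                                  # (1, 1, N, 2)

    # Sample the estimated flow at each Gaussian's screen location.
    flow = flow_pred.permute(2, 0, 1).unsqueeze(0)                 # (1, 2, H, W)
    flow_at_pts = F.grid_sample(flow, grid, align_corners=True)    # (1, 2, 1, N)
    flow_at_pts = flow_at_pts.squeeze(0).squeeze(1).T              # (N, 2)

    # Motion implied by the predicted deformation should agree with the observed flow.
    induced_flow = xy_t1 - xy_t
    residual = induced_flow - flow_at_pts
    if valid_mask is not None:
        residual = residual[valid_mask]
    return residual.abs().mean()                                   # L1 agreement term
```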
Problem

Research questions and friction points this paper is trying to address.

cloth dynamics
3D avatar
single-image reconstruction
animatable avatar
zero-shot
Innovation

Methods, ideas, or system contributions that make the work stand out.

zero-shot avatar reconstruction
cloth dynamics
Transformer-based deformation
static-to-dynamic knowledge transfer
DynaFlow loss