Articulated Kinematics Distillation from Video Diffusion Models

📅 2025-04-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three challenges in text-driven 4D character animation: low motion quality, structural inconsistency, and physically implausible dynamics. It proposes a joint-level motion distillation framework that distills articulated motion from a pre-trained video diffusion model via Score Distillation Sampling (SDS) and expresses it through a skeleton-driven representation, so that expressive motion and a consistent skeletal topology are modeled together. The resulting joint trajectories are compatible with physics-based simulation and avoid the shape instability inherent to 4D neural deformation fields. Experiments show improved 3D consistency and motion quality over existing text-to-4D generation methods.

📝 Abstract

We present Articulated Kinematics Distillation (AKD), a framework for generating high-fidelity character animations by merging the strengths of skeleton-based animation and modern generative models. AKD uses a skeleton-based representation for rigged 3D assets, drastically reducing the Degrees of Freedom (DoFs) by focusing on joint-level control, which allows for efficient, consistent motion synthesis. Through Score Distillation Sampling (SDS) with pre-trained video diffusion models, AKD distills complex, articulated motions while maintaining structural integrity, overcoming challenges faced by 4D neural deformation fields in preserving shape consistency. This approach is naturally compatible with physics-based simulation, ensuring physically plausible interactions. Experiments show that AKD achieves superior 3D consistency and motion quality compared with existing works on text-to-4D generation. Project page: https://research.nvidia.com/labs/dir/akd/
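The reduction in degrees of freedom described above can be illustrated with a minimal forward-kinematics sketch. This is not the AKD implementation: it assumes a hypothetical planar chain of three bones, and the names `forward_kinematics` and `bone_lengths_of` are invented for illustration. The point is that the pose is controlled by one angle per joint, and bone lengths are preserved for any choice of angles, which is the kind of structural consistency a free-form deformation field does not guarantee.

```python
import math


def forward_kinematics(bone_lengths, joint_angles):
    """Return 2D joint positions of a planar chain rooted at the origin.

    Each angle is relative to the parent bone, so the pose has exactly
    len(joint_angles) degrees of freedom.
    """
    x, y, heading = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle  # accumulate rotation down the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


def bone_lengths_of(positions):
    """Measure the distance between each pair of consecutive joints."""
    return [math.dist(a, b) for a, b in zip(positions, positions[1:])]


# Hypothetical three-bone chain and an arbitrary pose (angles in radians).
bones = [1.0, 0.8, 0.5]
pose = [0.3, -0.7, 1.2]

joints = forward_kinematics(bones, pose)
measured = bone_lengths_of(joints)
print(joints)
print(measured)
```

Changing `pose` moves every joint, but `measured` matches `bones` up to floating-point error, so the shape of the rig cannot drift while the motion is being optimized.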

Problem

Research questions and friction points this paper is trying to address.

Generates high-fidelity character animations by combining skeleton-based animation with generative models

Reduces motion complexity by focusing on joint-level control

Ensures structural integrity and physical plausibility in articulated motions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Skeleton-based representation reduces Degrees of Freedom

Score Distillation Sampling with video diffusion models

Ensures physically plausible interactions via physics simulation
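As a rough illustration of how Score Distillation Sampling can drive joint parameters, the toy sketch below optimizes a single joint angle. Everything here is an assumption made for illustration: the "renderer" is just the end point of one bone, `toy_noise_predictor` is a hypothetical stand-in for a pre-trained video diffusion model (it is the exact noise predictor for data concentrated at one target point), and the cosine noise schedule and the weighting `w(t) = sigma` are arbitrary choices. The update follows the generic SDS form, in which the weighted residual between predicted and injected noise is pushed back through the renderer's Jacobian.

```python
import math
import random


def render(theta, length=1.0):
    """Stand-in renderer: end point of a single bone at angle theta."""
    return (length * math.cos(theta), length * math.sin(theta))


def render_jacobian(theta, length=1.0):
    """Derivative of the rendered point with respect to theta."""
    return (-length * math.sin(theta), length * math.cos(theta))


def toy_noise_predictor(x_t, alpha, sigma, target):
    """Hypothetical denoiser: exact noise prediction when all data sits at `target`."""
    return tuple((xt - alpha * tg) / sigma for xt, tg in zip(x_t, target))


def sds_gradient(theta, target, rng):
    """One stochastic SDS gradient estimate with respect to the joint angle."""
    t = rng.uniform(0.02, 0.98)            # random diffusion time
    alpha = math.cos(0.5 * math.pi * t)    # assumed cosine schedule
    sigma = math.sin(0.5 * math.pi * t)
    x = render(theta)
    eps = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    x_t = tuple(alpha * xi + sigma * ei for xi, ei in zip(x, eps))
    eps_hat = toy_noise_predictor(x_t, alpha, sigma, target)
    residual = tuple(eh - e for eh, e in zip(eps_hat, eps))
    weight = sigma                         # assumed weighting w(t)
    jac = render_jacobian(theta)
    return weight * sum(r * j for r, j in zip(residual, jac))


rng = random.Random(0)
target_angle = 1.0                         # pose the stand-in model prefers
target = render(target_angle)
theta = 0.0                                # initial joint angle
for _ in range(500):
    theta -= 0.1 * sds_gradient(theta, target, rng)
print(theta)
```

In this toy setting the angle settles at `target_angle`. With a real video diffusion model the noise predictor is a large network conditioned on a text prompt, and the renderer must be differentiable with respect to the joint parameters.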

🔎 Similar Papers

DiffMesh: A Motion-aware Diffusion Framework for Human Mesh Recovery from Videos