SMF: Template-free and Rig-free Animation Transfer using Kinetic Codes

📅 2025-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing animation retargeting methods rely on templates, rigged skeletons, or annotated data, and suffer from poor generalization and motion jitter. This paper proposes the first fully self-supervised, universal animation transfer framework: it requires no templates, skeletons, or human annotations, and takes only sparse motion signals (2D/3D keypoint sequences) as input to robustly transfer motion onto arbitrary mesh characters, regardless of topology or geometry. The core innovation is Kinetic Codes, a semantically rich motion latent representation learned via an autoencoder, coupled with a spatiotemporal gradient prediction network for end-to-end motion reconstruction. Evaluated on multi-source datasets including AMASS, the method achieves state-of-the-art generalization, improving adaptability to unseen motions and diverse characters (including non-human topologies) while effectively suppressing motion jitter.
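From the summary, Kinetic Codes can be pictured as an autoencoder over sparse per-frame keypoints: each frame of 2D/3D keypoints is compressed into a latent "kinetic code" and decoded back. The sketch below illustrates only the data flow and tensor shapes; the dimensions, weights, and function names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: T frames, K sparse 3D keypoints, latent size D.
T, K, D = 16, 22, 64

def encode(keypoints, W_enc):
    """Flatten each frame's keypoints and project to a latent 'kinetic code'."""
    flat = keypoints.reshape(T, K * 3)  # (T, 66)
    return np.tanh(flat @ W_enc)        # (T, D)

def decode(codes, W_dec):
    """Map latent codes back to per-frame keypoint estimates."""
    flat = codes @ W_dec                # (T, 66)
    return flat.reshape(T, K, 3)

# Random (untrained) weights stand in for the learned autoencoder.
W_enc = rng.standard_normal((K * 3, D)) * 0.1
W_dec = rng.standard_normal((D, K * 3)) * 0.1

motion = rng.standard_normal((T, K, 3))  # a sparse keypoint sequence
codes = encode(motion, W_enc)            # per-frame kinetic codes
recon = decode(codes, W_dec)             # reconstructed keypoints

print(codes.shape)  # (16, 64)
print(recon.shape)  # (16, 22, 3)
```

In the paper's framework this latent space is what regularizes the downstream gradient predictors; here the encoder/decoder are plain linear maps chosen only to keep the example self-contained.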

📝 Abstract
Animation retargeting involves applying a sparse motion description (e.g., 2D/3D keypoint sequences) to a given character mesh to produce a semantically plausible and temporally coherent full-body motion. Existing approaches come with a mix of restrictions: they require annotated training data, assume access to template-based shape priors or artist-designed deformation rigs, suffer from limited generalization to unseen motions and/or shapes, or exhibit motion jitter. We propose Self-supervised Motion Fields (SMF), a self-supervised framework that can be robustly trained with sparse motion representations, without requiring dataset-specific annotations, templates, or rigs. At the heart of our method are Kinetic Codes, a novel autoencoder-based sparse motion encoding that exposes a semantically rich latent space, simplifying large-scale training. Our architecture comprises dedicated spatial and temporal gradient predictors, which are trained end-to-end. The resulting network, regularized by the Kinetic Codes' latent space, generalizes well across shapes and motions. We evaluate our method on unseen motions sampled from AMASS, D4D, Mixamo, and raw monocular video for animation transfer onto characters of varying shape and topology. We report a new SoTA on the AMASS dataset in the context of generalization to unseen motion. Project webpage at https://motionfields.github.io/
Problem

Research questions and friction points this paper addresses.

- Transfer sparse motion to character meshes without templates or rigs
- Overcome limitations of existing methods in generalization and motion jitter
- Enable robust training with self-supervised sparse motion representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

- Self-supervised Motion Fields framework
- Kinetic Codes for sparse motion encoding
- Spatial and temporal gradient predictors
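The temporal side of the contribution can be made concrete with a toy example: a temporal gradient predictor outputs frame-to-frame displacements, and the full motion is recovered by integrating them from the first frame. The array names and dimensions below are hypothetical; the paper predicts these gradients with a learned network, whereas here they are computed directly from ground truth just to show the integration step.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical: V mesh vertices animated over T frames.
T, V = 8, 100
positions = rng.standard_normal((T, V, 3)).cumsum(axis=0)  # a toy trajectory

# Temporal gradients: per-vertex displacement between consecutive frames.
# (In SMF these would come from the learned temporal gradient predictor.)
temporal_grad = positions[1:] - positions[:-1]  # (T-1, V, 3)

# Given the first frame plus the gradients, cumulative integration
# recovers the whole sequence.
recovered = np.concatenate(
    [positions[:1], positions[:1] + temporal_grad.cumsum(axis=0)], axis=0
)

print(np.allclose(recovered, positions))  # True
```

Predicting gradients rather than absolute positions is a common way to encourage temporal smoothness, which is consistent with the paper's stated goal of suppressing motion jitter.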