Topology-Agnostic Animal Motion Generation from Text Prompt

📅 2025-12-11
🤖 AI Summary
Existing motion generation methods are constrained by the scarcity of heterogeneous animal motion data and by modeling bottlenecks arising from fixed skeletal templates. To address these limitations, we propose the first text-driven, topology-agnostic animal motion generation framework. Our approach comprises three key components: (1) constructing OmniZoo, a large-scale, multi-species motion dataset encompassing 140 animal species and 32,979 motion sequences; (2) designing a topology-aware skeletal embedding module that encodes arbitrary skeletal geometries into a token space shared with textual semantics; and (3) integrating autoregressive sequence modeling, multimodal alignment, and joint representation learning. The resulting method generates physically plausible, temporally coherent, and semantically accurate motions. It further enables cross-species motion style transfer and demonstrates strong generalization to unseen skeletal topologies.

📝 Abstract
Motion generation is fundamental to computer animation and widely used across entertainment, robotics, and virtual environments. While recent methods achieve impressive results, most rely on fixed skeletal templates, which prevent them from generalizing to skeletons with different or perturbed topologies. We address the core limitation of current motion generation methods: the combined lack of large-scale heterogeneous animal motion data and of unified generative frameworks capable of jointly modeling arbitrary skeletal topologies and textual conditions. To this end, we introduce OmniZoo, a large-scale animal motion dataset spanning 140 species and 32,979 sequences, enriched with multimodal annotations. Building on OmniZoo, we propose a generalized autoregressive motion generation framework capable of producing text-driven motions for arbitrary skeletal topologies. Central to our model is a Topology-aware Skeleton Embedding Module that encodes geometric and structural properties of any skeleton into a shared token space, enabling seamless fusion with textual semantics. Given a text prompt and a target skeleton, our method generates temporally coherent, physically plausible, and semantically aligned motions, and further enables cross-species motion style transfer.
Problem

Research questions and friction points this paper is trying to address.

Generates animal motions from text prompts for arbitrary skeletal topologies
Addresses lack of large-scale heterogeneous animal motion data and unified frameworks
Enables cross-species motion style transfer and text-driven generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Topology-aware skeleton embedding for arbitrary skeletons
Autoregressive framework for text-driven motion generation
Cross-species motion style transfer capability
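
To make the idea of a shared token space for arbitrary skeletons more concrete, here is a minimal sketch in plain Python. It is not the paper's Topology-aware Skeleton Embedding Module, whose internals are not described on this page. The function name `skeleton_tokens`, the choice of features (tree depth, child count, bone length, rest offset), and the toy skeletons are all assumptions made for illustration. The only point it shows is that skeletons with different joint counts can be mapped to sequences of tokens with the same width, which a sequence model could then consume alongside text tokens.

```python
from math import sqrt


def skeleton_tokens(parents, offsets):
    """Return one fixed-width feature vector per joint (illustrative only).

    parents[i] is the index of joint i's parent (-1 for the root).
    offsets[i] is joint i's rest-pose offset from its parent as (x, y, z).
    Assumes `parents` describes a valid tree (no cycles).
    """
    n = len(parents)
    if len(offsets) != n:
        raise ValueError("parents and offsets must have the same length")

    # Structural feature: number of direct children of each joint.
    child_count = [0] * n
    for p in parents:
        if p >= 0:
            child_count[p] += 1

    # Structural feature: depth in the kinematic tree (root has depth 0).
    def depth(i):
        d = 0
        while parents[i] >= 0:
            i = parents[i]
            d += 1
        return d

    tokens = []
    for i in range(n):
        ox, oy, oz = (float(v) for v in offsets[i])
        bone_length = sqrt(ox * ox + oy * oy + oz * oz)
        tokens.append(
            [float(depth(i)), float(child_count[i]), bone_length, ox, oy, oz]
        )
    return tokens


# Two toy skeletons with different joint counts (hypothetical values).
biped = skeleton_tokens(
    [-1, 0, 1, 0],
    [(0, 0, 0), (0, 1, 0), (0, 1, 0), (0, -1, 0)],
)
snake = skeleton_tokens(
    [-1, 0, 1, 2, 3, 4],
    [(0, 0, 0)] + [(1, 0, 0)] * 5,
)
```

Here `biped` has four tokens and `snake` has six, but every token has six entries, so both skeletons fit the same downstream input format. A learned module would replace these hand-picked features with trained embeddings.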
Authors

Keyi Chen
Tsinghua Shenzhen International Graduate School, China
Mingze Sun
Tsinghua University
computer vision, graphics
Zhenyu Liu
Tsinghua Shenzhen International Graduate School, China
Zhangquan Chen
Tsinghua Shenzhen International Graduate School, China
Ruqi Huang
Tsinghua Shenzhen International Graduate School
3D Computer Vision, Shape Analysis, Geometry Processing