AnyTop: Character Animation Diffusion with Any Topology

📅 2025-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Motion generation for arbitrary skeletal topologies has long been hindered by data scarcity and structural irregularity. To address this, we propose AnyTop, a topology-agnostic diffusion framework built on a transformer-based denoising network: skeletal topology is encoded as an attention bias, and textual joint descriptions are incorporated into the latent features so that the model learns semantic correspondences between joints across skeletons. The method generalizes with as few as three training examples per topology and can also produce motions for unseen skeletons. Its latent space is informative enough to support downstream tasks such as joint correspondence, temporal segmentation, and motion editing.

📝 Abstract
Generating motion for arbitrary skeletons is a longstanding challenge in computer graphics, remaining largely unexplored due to the scarcity of diverse datasets and the irregular nature of the data. In this work, we introduce AnyTop, a diffusion model that generates motions for diverse characters with distinct motion dynamics, using only their skeletal structure as input. Our work features a transformer-based denoising network, tailored for arbitrary skeleton learning, integrating topology information into the traditional attention mechanism. Additionally, by incorporating textual joint descriptions into the latent feature representation, AnyTop learns semantic correspondences between joints across diverse skeletons. Our evaluation demonstrates that AnyTop generalizes well, even with as few as three training examples per topology, and can produce motions for unseen skeletons as well. Furthermore, our model's latent space is highly informative, enabling downstream tasks such as joint correspondence, temporal segmentation and motion editing. Our webpage, https://anytop2025.github.io/Anytop-page, includes links to videos and code.
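The idea of integrating topology information into attention can be illustrated with a small sketch. The code below is not AnyTop's implementation; it assumes one common way of doing this, namely adding a bias to the attention scores that depends on the hop distance between two joints in the skeleton tree. The names `hop_distances`, `topology_biased_attention`, and `bias_table`, the single-head layout, and the toy five-joint skeleton are all hypothetical choices made for illustration.

```python
from collections import deque

import numpy as np


def hop_distances(parents):
    """All-pairs hop distance between joints of a connected skeleton tree.

    parents[j] is the parent index of joint j, or -1 for the root.
    """
    n = len(parents)
    adj = [[] for _ in range(n)]
    for j, p in enumerate(parents):
        if p >= 0:
            adj[j].append(p)
            adj[p].append(j)
    dist = np.full((n, n), -1, dtype=int)
    for s in range(n):  # breadth-first search from every joint
        dist[s, s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[s, v] < 0:
                    dist[s, v] = dist[s, u] + 1
                    queue.append(v)
    return dist


def topology_biased_attention(x, parents, wq, wk, wv, bias_table):
    """Single-head self-attention over joints with an additive topology bias.

    x: (J, D) joint features. bias_table: (K,) one scalar per hop distance;
    distances >= K share the last entry. In a trained model the table would
    be learned; here it is supplied by the caller (an assumption).
    """
    dist = np.minimum(hop_distances(parents), len(bias_table) - 1)
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1]) + bias_table[dist]
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Toy skeleton: joint 0 is the root, 1 and 4 hang off it, 2 and 3 hang off 1.
toy_parents = [-1, 0, 1, 1, 0]
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
toy_bias = np.array([0.0, -1.0, -2.0, -3.0])  # penalize distant joints
out, attn = topology_biased_attention(features, toy_parents, wq, wk, wv, toy_bias)
```

Because the bias depends only on the parent array, the same layer can be applied to skeletons with different joint counts and connectivity, which is the property a topology-agnostic model needs.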
Problem

Research questions and friction points this paper is trying to address.

Generating motion for diverse character skeletons despite scarce, irregular data
Using only the skeletal structure as input
Integrating topology information into the attention mechanism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion model for motion generation across diverse skeletons
Transformer-based denoising network tailored to arbitrary skeletons
Textual joint descriptions incorporated into the latent feature representation