TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation

📅 2024-08-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing two-person motion generation methods suffer from modeling deficiencies: naive concatenation of individual skeletons ignores interaction causality, while separate modeling fails to capture dynamic role evolution—leading to suboptimal performance and parameter redundancy. This paper proposes a two-stage framework comprising temporal modeling and interaction fusion, introducing three novel mechanisms: causal interaction injection, role-evolution scanning, and local pattern enhancement. Our method builds an efficient Transformer architecture grounded in causal sequence modeling, dynamic role-aware attention, and lightweight spatiotemporal convolutions. Evaluated on InterHuman and InterX, it significantly outperforms state-of-the-art approaches: improving motion plausibility by 12.6%, reducing model parameters by 37%, and generating motions that are more temporally coherent, physically plausible, and socially semantically consistent.

Technology Category

Application Category

📝 Abstract
Human-human motion generation is essential for understanding humans as social beings. Current methods fall into two main categories: single-person-based methods and separate modeling-based methods. To delve into this field, we abstract the overall generation process into a general framework MetaMotion, which consists of two phases: temporal modeling and interaction mixing. For temporal modeling, the single-person-based methods concatenate two people into a single one directly, while the separate modeling-based methods skip the modeling of interaction sequences. The inadequate modeling described above resulted in sub-optimal performance and redundant model parameters. In this paper, we introduce TIMotion (Temporal and Interactive Modeling), an efficient and effective framework for human-human motion generation. Specifically, we first propose Causal Interactive Injection to model two separate sequences as a causal sequence leveraging the temporal and causal properties. Then we present Role-Evolving Scanning to adjust to the change in the active and passive roles throughout the interaction. Finally, to generate smoother and more rational motion, we design Localized Pattern Amplification to capture short-term motion patterns. Extensive experiments on InterHuman and InterX demonstrate that our method achieves superior performance. The project code will be released upon acceptance. Project page: https://aigc-explorer.github.io/TIMotion-page/
Problem

Research questions and friction points this paper is trying to address.

Efficient human-human motion generation framework
Addresses limitations in temporal and interaction modeling
Improves motion smoothness and rationality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal Interactive Injection for sequence modeling
Role-Evolving Scanning for role adaptation
Localized Pattern Amplification for smooth motion
🔎 Similar Papers
No similar papers found.
Y
Yabiao Wang
Zhejiang University, Youtu Lab, Tencent
S
Shuo Wang
Youtu Lab, Tencent
J
Jiangning Zhang
Youtu Lab, Tencent
Ke Fan
Ke Fan
Fudan University
Machine LearningDeep Learning
Jiafu Wu
Jiafu Wu
Tencent Youtu Lab
AIGCLLM
Z
Zhucun Xue
Y
Yong Liu
Zhejiang University