🤖 AI Summary
Existing motion generation methods often neglect physical constraints, leading to non-physical artifacts such as interpenetration, sliding, and floating in multi-agent interaction scenarios. To address this, we propose an end-to-end physics-aware generation framework that establishes a holistic physics mapping mechanism spanning modeling, training, and post-processing. Our approach uses motion imitation within a PyBullet/DeepMimic simulation environment to project target motions into a physically valid space while preserving their semantics. We introduce two loss functions tailored to multi-agent interaction: a Motion Consistency (MC) loss and a Marker-based Interaction (MI) loss. Additionally, we combine MoCap data augmentation with physics-guided post-processing of generated motions. Quantitative evaluation shows a 3%–89% improvement across multiple physics-fidelity metrics; the method effectively suppresses non-physical interaction artifacts, yielding motions that are more natural, dynamically stable, and controllable.
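The summary names the two losses but does not give their formulations. The PyTorch sketch below shows one plausible shape for each: an MC term that keeps the generated motion close to its physics-projected counterpart, and an MI term that matches inter-person marker distances to a reference. All tensor layouts, function names, and penalty choices are illustrative assumptions, not the authors' definitions.

```python
# Hypothetical loss sketches (PyTorch). Shapes and formulations are
# assumptions for illustration; the paper's exact losses may differ.
import torch

def motion_consistency_loss(generated, projected):
    """Assumed MC loss: keep the generated motion close to its
    physics-projected counterpart so semantics survive the projection.
    generated, projected: (batch, frames, joints, 3) joint positions."""
    return torch.mean((generated - projected) ** 2)

def marker_interaction_loss(markers_a, markers_b, target_dist):
    """Assumed MI loss: penalize deviation of inter-person marker
    distances from those in the reference interaction.
    markers_a, markers_b: (batch, frames, markers, 3) marker positions;
    target_dist: (batch, frames, markers) reference pairwise distances."""
    dist = torch.norm(markers_a - markers_b, dim=-1)
    return torch.mean(torch.abs(dist - target_dist))
```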
📝 Abstract
Driven by advances in motion capture and generative artificial intelligence, leveraging large-scale MoCap datasets to train generative models that synthesize diverse, realistic human motions has become a promising research direction. However, existing motion-capture techniques and generative models often neglect physical constraints, leading to artifacts such as interpenetration, sliding, and floating. These issues are exacerbated in multi-person motion generation, which involves complex interactions. To address these limitations, we introduce physical mapping, integrated throughout the human interaction generation pipeline. Specifically, motion imitation within a physics-based simulation environment projects target motions into a physically valid space. The resulting motions are adjusted to adhere to real-world physical constraints while retaining their original semantic meaning. This mapping not only improves MoCap data quality but also directly informs post-processing of generated motions. Given the unique interactivity of multi-person scenarios, we propose a tailored motion representation framework, introducing Motion Consistency (MC) and Marker-based Interaction (MI) loss functions to improve model performance. Experiments show that our method markedly improves the quality of generated human motion, with a 3%–89% gain in physical fidelity. Project page: http://yw0208.github.io/physiinter
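To make the "motion imitation projects target motions into a physically valid space" step concrete, here is a minimal PyBullet sketch of one way such a projection loop could work: a simulated character tracks each MoCap frame under position control, and the simulated state is read back as the physics-projected motion. The PyBullet calls are real API, but the asset path, control gains, and frame format are placeholders, and the paper's actual imitation controller (DeepMimic-style) is more sophisticated than this.

```python
# Minimal physics-mapping sketch: track each MoCap frame with a
# position-controlled simulated character, then read the simulated state
# back as the physically valid motion. "humanoid.urdf" is a placeholder;
# substitute the character model your pipeline uses.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                      # headless simulation
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")                 # ground plane from pybullet_data
humanoid = p.loadURDF("humanoid.urdf", basePosition=[0, 0, 1])

# Actuate only the revolute (hinge) joints of the model.
joints = [j for j in range(p.getNumJoints(humanoid))
          if p.getJointInfo(humanoid, j)[2] == p.JOINT_REVOLUTE]

def project_motion(target_frames, steps_per_frame=8):
    """Project a target motion into physically valid space.

    target_frames: list of per-joint target angles, one list per frame.
    With PyBullet's default 240 Hz timestep, 8 substeps per frame
    roughly matches 30 fps MoCap.
    """
    projected = []
    for frame in target_frames:
        for j, angle in zip(joints, frame):
            p.setJointMotorControl2(humanoid, j, p.POSITION_CONTROL,
                                    targetPosition=angle, force=200.0)
        for _ in range(steps_per_frame):
            p.stepSimulation()
        # The simulated pose is the physics-projected version of the frame.
        projected.append([p.getJointState(humanoid, j)[0] for j in joints])
    return projected
```

Because the simulator enforces contact, gravity, and joint limits at every step, artifacts like interpenetration or floating cannot survive the projection; the open design question, which the MC loss above addresses, is keeping the projected motion semantically faithful to the original.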