🤖 AI Summary
Human trajectory prediction (HTP) methods commonly use pose cues only implicitly, often yielding physically implausible predictions. To address this, the authors propose Locomotion Embodiment, an embodied motion modeling framework that explicitly incorporates biomechanical and physical constraints. The approach introduces three key contributions: (1) a differentiable Locomotion Value function that approximates physics-based simulation while enabling gradient-based optimization; (2) an Embodied Locomotion loss that enforces physical plausibility while training a multi-head stochastic HTP network end to end; and (3) a Locomotion Value filter that discards physically infeasible trajectories at inference. Evaluated on standard benchmarks, including ETH and UCY, the method yields significant improvements over state-of-the-art models in both prediction accuracy (ADE/FDE) and physical plausibility. All code is publicly available.
📝 Abstract
Humans can predict future human trajectories even from momentary observations by using human pose-related cues. However, previous Human Trajectory Prediction (HTP) methods leverage these pose cues only implicitly, resulting in physically implausible predictions. To address this, we propose Locomotion Embodiment, a framework that explicitly evaluates the physical plausibility of a predicted trajectory via locomotion generation under the laws of physics. While the plausibility of locomotion is learned with a non-differentiable physics simulator, it is replaced by our differentiable Locomotion Value function to train an HTP network in a data-driven manner. In particular, our proposed Embodied Locomotion loss enables efficient training of a stochastic HTP network with multiple heads. Furthermore, the Locomotion Value filter is proposed to filter out implausible trajectories at inference. Experiments demonstrate that our method enhances even state-of-the-art HTP methods across diverse datasets and problem settings. Our code is available at: https://github.com/ImIntheMiddle/EmLoco.
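As a rough illustration of the inference-time filtering idea, the minimal sketch below scores candidate trajectories and keeps only the plausible ones. Note this is not the paper's implementation: the real Locomotion Value function is a trained network, whereas `locomotion_value` here is a hand-crafted stand-in (a simple per-step speed cap), and the cap and threshold values are hypothetical.

```python
import numpy as np

def locomotion_value(traj, speed_cap=1.5):
    """Hypothetical stand-in for the learned Locomotion Value function.

    traj: (T, 2) array of 2D positions at unit time steps.
    Returns a plausibility score in (0, 1], penalizing per-step
    speeds above `speed_cap` (meters per step, assumed).
    """
    speeds = np.linalg.norm(np.diff(traj, axis=0), axis=1)
    excess = np.maximum(speeds - speed_cap, 0.0).sum()
    return float(np.exp(-excess))

def locomotion_value_filter(candidates, threshold=0.5):
    """Keep only candidate trajectories whose plausibility score
    clears the threshold (the filtering step applied at inference)."""
    return [t for t in candidates if locomotion_value(t) >= threshold]
```

For example, a trajectory advancing 0.5 per step passes the filter, while one "teleporting" 5.0 per step is discarded; in the paper this score would instead come from the value function distilled from the physics simulator.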