🤖 AI Summary
This work addresses real-time human motion understanding for human–robot interaction and mobile robot navigation. To jointly predict full-body pose dynamics and global navigation trajectories from short input sequences, we propose a unified non-autoregressive model that integrates kinematic and trajectory modeling. Our method strengthens temporal modeling via a motion transformation technique and combines an off-the-shelf 3D pose estimation module, graph attention networks (GATs) that encode skeletal topology, and a lightweight non-autoregressive Transformer architecture. We also introduce DARKO, a human mobility dataset designed for navigation-oriented behavioral analysis. Evaluated on Human3.6M, CMU-Mocap, and DARKO, our approach runs at over 30 FPS while attaining state-of-the-art accuracy in both pose and trajectory prediction. This significantly improves the reliability with which robotic systems anticipate human motion intent and navigate dynamic, human-populated environments.
📝 Abstract
We introduce a unified approach that forecasts the dynamics of human keypoints along with the global motion trajectory from a short sequence of input poses. While many studies address either full-body pose prediction or motion trajectory prediction, only a few attempt to merge them. We propose a motion transformation technique to simultaneously predict full-body pose and trajectory keypoints in a global coordinate frame. We utilize an off-the-shelf 3D human pose estimation module, a graph attention network to encode the skeleton structure, and a compact, non-autoregressive Transformer suitable for real-time motion prediction in human-robot interaction and human-aware navigation. We introduce a human navigation dataset, "DARKO", with a specific focus on navigational activities relevant to human-aware mobile robot navigation. We perform extensive evaluation on Human3.6M, CMU-Mocap, and our DARKO dataset. Compared to prior work, our approach is compact, real-time, and accurate in predicting human navigation motion across all datasets. Result animations, our dataset, and code will be available at https://nisarganc.github.io/UPTor-page/
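To make the described pipeline concrete, here is a minimal NumPy sketch of its two core ideas: a graph-attention layer that mixes joint features only along skeleton edges, and a non-autoregressive head that emits all future frames in one parallel pass (rather than rolling out frame by frame). This is an illustration only, not the paper's implementation; the joint count, feature sizes, chain skeleton topology, and random weights are all hypothetical.

```python
import numpy as np

# Hypothetical sizes: T_IN input frames, T_OUT predicted frames, J joints, 3D coords.
T_IN, T_OUT, J, D = 10, 25, 17, 3
rng = np.random.default_rng(0)

def graph_attention(X, adj, W, a_src, a_dst):
    """One simplified GAT-style layer over skeleton joints.
    X: (J, D) joint features; adj: (J, J) skeleton adjacency (1 = bone or self-loop)."""
    H = X @ W                           # project features: (J, F)
    e = (H @ a_src)[:, None] + (H @ a_dst)[None, :]  # pairwise attention logits (J, J)
    e = np.where(e > 0, e, 0.2 * e)     # LeakyReLU
    e = np.where(adj > 0, e, -1e9)      # attend only along skeleton edges
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)        # row-wise softmax
    return alpha @ H                    # aggregated joint features (J, F)

def predict_motion(poses, adj, feat=16):
    """poses: (T_IN, J, D) -> (T_OUT, J, D), all future frames in a single pass."""
    W = rng.normal(0, 0.1, (D, feat))
    a_src = rng.normal(0, 0.1, feat)
    a_dst = rng.normal(0, 0.1, feat)
    # Encode each input frame's skeleton with graph attention.
    enc = np.stack([graph_attention(x, adj, W, a_src, a_dst) for x in poses])
    # Non-autoregressive head: one linear map from the whole input window
    # to the whole output window (stand-in for the parallel Transformer decoder).
    W_out = rng.normal(0, 0.01, (T_IN * feat, T_OUT * D))
    flat = enc.transpose(1, 0, 2).reshape(J, T_IN * feat)  # per-joint temporal features
    out = flat @ W_out                                     # (J, T_OUT * D)
    return out.reshape(J, T_OUT, D).transpose(1, 0, 2)     # (T_OUT, J, D)

# Hypothetical chain skeleton with self-loops (a real model uses the true topology).
adj = np.eye(J)
for j in range(J - 1):
    adj[j, j + 1] = adj[j + 1, j] = 1

pred = predict_motion(rng.normal(size=(T_IN, J, D)), adj)
```

Because every future frame is produced in one forward pass, inference cost does not grow with the prediction horizon, which is what makes the non-autoregressive design attractive for the real-time (>30 FPS) setting described above.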