AI Summary
This work proposes a structured multi-task variational Gaussian process framework for real-time, uncertainty-calibrated whole-body human motion prediction to enhance human-robot collaboration safety. By leveraging joint-wise factorization and a continuous 6D rotation representation, the method enables kinematically consistent and scalable probabilistic modeling. It combines a Matérn 3/2 + Linear kernel with sparse inducing-point approximations, achieving high predictive accuracy and interpretable uncertainty estimates while reducing the parameter count to only 0.24–0.35 million. Evaluated on Human3.6M, the approach lowers KDE negative log-likelihood by up to 50%, attains a CRPS of 0.021 m, and achieves mean angle error within 3–18% of state-of-the-art deep learning methods, demonstrating its suitability for real-time deployment.
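For concreteness, the continuous 6D rotation representation mentioned above maps a 6-vector to a rotation matrix via Gram-Schmidt orthonormalization. The sketch below is a minimal NumPy illustration of that standard construction, not the paper's code; the function name `rotmat_from_6d` is a hypothetical label.

```python
import numpy as np

def rotmat_from_6d(x6: np.ndarray) -> np.ndarray:
    """Minimal sketch: map a 6D vector to a rotation matrix via
    Gram-Schmidt, the standard continuous 6D rotation construction.
    (Illustrative only; not the paper's implementation.)"""
    a1, a2 = x6[:3], x6[3:]
    b1 = a1 / np.linalg.norm(a1)            # first orthonormal column
    a2 = a2 - np.dot(b1, a2) * b1           # remove component along b1
    b2 = a2 / np.linalg.norm(a2)            # second orthonormal column
    b3 = np.cross(b1, b2)                   # cross product completes SO(3)
    return np.stack([b1, b2, b3], axis=-1)  # columns b1, b2, b3 form R
```

Unlike Euler angles or quaternions, this mapping has no representational discontinuities, which is what makes it attractive as a regression target for kinematically consistent prediction.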
Abstract
Accurate human motion prediction with well-calibrated uncertainty is critical for safe human-robot collaboration (HRC), where robots must anticipate and react to human movements in real time. We propose a structured multi-task variational Gaussian Process (GP) framework for full-body human motion prediction that captures temporal correlations and leverages joint-dimension-level factorization for scalability, while using a continuous 6D rotation representation to preserve kinematic consistency. Evaluated on Human3.6M (H3.6M), our model achieves up to 50% lower kernel density estimate negative log-likelihood (KDE NLL) than strong baselines, a mean continuous ranked probability score (CRPS) of 0.021 m, and a deterministic mean angle error (MAE) only 3–18% higher than that of competitive deep learning methods. Empirical coverage analysis shows that the fraction of ground-truth outcomes contained within predicted confidence intervals gradually decreases with horizon, remaining conservative for lower-confidence intervals and near-nominal for higher-confidence intervals, with only modest calibration drift at longer horizons. Despite its probabilistic formulation, our model requires only 0.24–0.35 M parameters, roughly eight times fewer than comparable approaches, and exhibits modest inference times, indicating suitability for real-time deployment. Extensive ablation studies further validate the choice of the 6D rotation representation and the Matérn 3/2 + Linear kernel, and guide the selection of the number of inducing points and the latent dimensionality. These results demonstrate that scalable GP-based models can deliver competitive accuracy together with reliable and interpretable uncertainty estimates for downstream robotics tasks such as motion planning and collision avoidance.
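To make the kernel and sparsity choices named in the abstract concrete, here is a minimal single-output sketch in GPyTorch of a sparse variational GP with a Matérn 3/2 + Linear kernel. The class name `SparseMotionGP`, the 64 learnable inducing points, and the 1-D time input are illustrative assumptions; the paper's actual model is multi-task with joint-dimension-level factorization, which this sketch does not attempt to reproduce.

```python
import torch
import gpytorch

class SparseMotionGP(gpytorch.models.ApproximateGP):
    """Sketch of a sparse variational GP with a Matérn 3/2 + Linear kernel.
    Illustrative single-output stand-in for the paper's multi-task model."""

    def __init__(self, inducing_points: torch.Tensor):
        # Gaussian variational posterior over the inducing outputs.
        variational_distribution = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0)
        )
        # Inducing-point approximation; locations are learned jointly.
        variational_strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, variational_distribution,
            learn_inducing_locations=True,
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        # Matérn 3/2 captures locally rough temporal correlations;
        # the Linear kernel adds a global trend component.
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.MaternKernel(nu=1.5)
            + gpytorch.kernels.LinearKernel()
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

# Hypothetical usage: 64 inducing points over a normalized 1-D time axis.
model = SparseMotionGP(inducing_points=torch.linspace(0, 1, 64).unsqueeze(-1))
```

With M inducing points and N training inputs, this approximation keeps inference cost at O(NM^2) rather than the O(N^3) of an exact GP, consistent with the abstract's emphasis on real-time deployment.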