Legs Over Arms: On the Predictive Value of Lower-Body Pose for Human Trajectory Prediction from Egocentric Robot Perception

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of improving human trajectory prediction accuracy for social robots operating in crowded environments by systematically evaluating the contribution of human skeletal keypoints—particularly those of the lower limbs—and associated biomechanical cues to multi-agent trajectory forecasting. It further investigates methods for extracting effective motion features from first-person panoramic visual inputs captured by the robot. The work presents the first evidence that 3D lower-limb pose plays a dominant role in trajectory prediction, while also demonstrating that 2D lower-limb keypoints extracted from monocular panoramic images retain significant predictive value. Experiments on the JRDB dataset and a newly curated panoramic social navigation dataset show that using only 3D lower-limb keypoints reduces average displacement error by 13%, with an additional 1–4% improvement when biomechanical features are incorporated, offering novel insights for the design of robotic perception systems.

📝 Abstract
Predicting human trajectories is crucial for social robot navigation in crowded environments. While most existing approaches treat humans as point masses, we present a study on multi-agent trajectory prediction that leverages different human skeletal features for improved forecasting accuracy. In particular, we systematically evaluate the predictive utility of 2D and 3D skeletal keypoints, and of derived biomechanical cues, as additional inputs. Through a comprehensive study on the JRDB dataset and a new social navigation dataset with 360-degree panoramic videos, we find that focusing on lower-body 3D keypoints yields a 13% reduction in Average Displacement Error, and that augmenting 3D keypoint inputs with corresponding biomechanical cues provides a further 1-4% improvement. Notably, the performance gain persists when using 2D keypoint inputs extracted from equirectangular panoramic images, indicating that monocular surround vision can capture informative cues for motion forecasting. Our finding that robots can forecast human movement efficiently by watching people's legs provides actionable insights for designing sensing capabilities for social robot navigation.
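The abstract reports gains in Average Displacement Error (ADE). As a quick reference, ADE is commonly computed as the mean Euclidean distance between predicted and ground-truth positions over all forecast timesteps and agents. Below is a minimal sketch of that standard metric; the function name and toy data are illustrative, not taken from the paper.

```python
import numpy as np

def average_displacement_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth
    positions, averaged over all agents and timesteps.

    pred, gt: arrays of shape (num_agents, num_timesteps, 2).
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Toy example: one agent walking along the x-axis over three timesteps,
# with every predicted position offset by 0.3 m in y.
gt = np.array([[[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]])
pred = np.array([[[0.0, 0.3], [1.0, 0.3], [2.0, 0.3]]])
ade = average_displacement_error(pred, gt)  # ~0.3
```

A 13% reduction in this quantity, as reported for 3D lower-limb keypoint inputs, means the predicted paths lie on average 13% closer to the true paths.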
Problem

Research questions and friction points this paper is trying to address.

human trajectory prediction
social robot navigation
lower-body pose
egocentric perception
skeletal keypoints
Innovation

Methods, ideas, or system contributions that make the work stand out.

lower-body pose
trajectory prediction
egocentric perception
skeletal keypoints
social robot navigation