Social-Pose: Enhancing Trajectory Prediction with Human Body Pose

📅 2025-07-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing trajectory prediction models neglect implicit visual cues in pedestrian motion—particularly social and behavioral information encoded in human pose. To address this, we propose Social-Pose, the first systematic pose encoder integrating both 2D and 3D human pose representations into trajectory forecasting frameworks. Leveraging attention mechanisms, it explicitly models inter-agent interactions and motion intent. Social-Pose is modular and compatible with diverse backbone architectures—including LSTM, GAN, MLP, and Transformer—enabling plug-and-play integration. Evaluated on Joint Track Auto, Human3.6M, Pedestrians and Cyclists in Road Traffic, and JRDB, it achieves substantial improvements in prediction accuracy, reducing average ADE by 12.7%–23.4%. The encoder demonstrates robustness to pose estimation noise, strong cross-scenario generalization, and practical efficacy validated in real-world robot navigation tasks.

Technology Category

Application Category

📝 Abstract
Accurate human trajectory prediction is one of the most crucial tasks for autonomous driving, ensuring its safety. Yet, existing models often fail to fully leverage the visual cues that humans subconsciously communicate when navigating the space. In this work, we study the benefits of predicting human trajectories using human body poses instead of solely their Cartesian space locations in time. We propose `Social-pose', an attention-based pose encoder that effectively captures the poses of all humans in a scene and their social relations. Our method can be integrated into various trajectory prediction architectures. We have conducted extensive experiments on state-of-the-art models (based on LSTM, GAN, MLP, and Transformer), and showed improvements over all of them on synthetic (Joint Track Auto) and real (Human3.6M, Pedestrians and Cyclists in Road Traffic, and JRDB) datasets. We also explored the advantages of using 2D versus 3D poses, as well as the effect of noisy poses and the application of our pose-based predictor in robot navigation scenarios.
Problem

Research questions and friction points this paper is trying to address.

Improving human trajectory prediction using body poses
Enhancing autonomous driving safety with social pose cues
Integrating pose-based prediction into diverse model architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses human body poses for trajectory prediction
Integrates attention-based pose encoder
Compatible with multiple prediction architectures
🔎 Similar Papers
No similar papers found.
Y
Yang Gao
Visual Intelligence for Transportation (VITA) lab, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
Saeed Saadatnejad
Saeed Saadatnejad
PhD, EPFL
Machine LearningComputer Vision
Alexandre Alahi
Alexandre Alahi
Professor, EPFL
Computer VisionTransportationAutonomous drivingIntelligent Transportation SystemsAI