ParkourFormer: Integrating Predictive Supervision and Sequence Modeling into Parkour Locomotion

📅 2026-05-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing humanoid parkour policies lack explicit modeling of future body states, which limits their ability to anticipate contact transitions and body dynamics on complex terrain. This work formulates parkour as a sequential decision-making problem conditioned on future states and proposes an end-to-end reinforcement learning policy built on a Transformer architecture. The policy attends over historical proprioceptive and motion trajectories via cross-attention and includes a lightweight prediction head, trained with supervised learning, that forecasts short-horizon future proprioceptive states. By combining explicit future-state prediction with temporal modeling in one framework, a single unified policy achieves robust control across diverse challenging terrains. Evaluated both in simulation and on a physical robot, it attains an average terrain traversal success rate of 93.85%, an improvement of up to 42.73% over MLP, MoE-MLP, and standard Transformer baselines.

📝 Abstract

Humanoid parkour requires locomotion policies to coordinate whole-body dynamics across rapidly changing terrains such as stairs, gaps, slopes, and obstacles. Existing reinforcement learning policies are largely reactive, mapping observations directly to actions without explicitly modeling future body states. Such modeling becomes critical in agile locomotion tasks where successful motion execution depends strongly on anticipating upcoming contact transitions and body dynamics. We present ParkourFormer, a Transformer-based sequence modeling framework that reformulates humanoid locomotion as a future-conditioned decision-making problem. The current robot state queries historical sensorimotor trajectories through cross-attention, while a lightweight prediction head forecasts short-horizon future proprioceptive states. The predicted future states, trained with supervised signals, are fused with temporal features to generate actions, enabling the policy to jointly reason over motion history and anticipated future dynamics. We evaluate ParkourFormer on a diverse multi-terrain humanoid parkour benchmark including stairs, gaps, slopes, rough terrain, and obstacle traversal. Experiments in simulation and on a real humanoid robot show that ParkourFormer achieves a 93.85% average traversal success rate on highly challenging terrains, with improvements of up to 42.73% over strong MLP, MoE-based MLP, and vanilla Transformer baselines, while maintaining a single unified policy across all terrain types. These results demonstrate that explicit future-state modeling significantly improves robustness and generalization for agile whole-body locomotion.
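
The data flow described in the abstract (current state as query, history as keys and values, a prediction head, then fusion into an action) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The class name `FutureConditionedPolicy`, the single attention head, the linear heads, the random untrained weights, and every dimension are assumptions made here for illustration, and the actual ParkourFormer architecture and training losses may differ.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Random (untrained) weights, stand-ins for learned parameters."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention with one query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    context = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
    return context, weights


class FutureConditionedPolicy:
    """Hypothetical sketch: names and sizes are assumptions, not from the paper."""

    def __init__(self, obs_dim, act_dim, model_dim=8, horizon=2, seed=0):
        rng = random.Random(seed)
        self.obs_dim, self.horizon = obs_dim, horizon
        self.W_q = rand_matrix(model_dim, obs_dim, rng)
        self.W_k = rand_matrix(model_dim, obs_dim, rng)
        self.W_v = rand_matrix(model_dim, obs_dim, rng)
        # Prediction head: temporal features -> `horizon` future proprioceptive states.
        self.W_pred = rand_matrix(horizon * obs_dim, model_dim, rng)
        # Action head consumes temporal features fused with the predicted future.
        self.W_act = rand_matrix(act_dim, model_dim + horizon * obs_dim, rng)

    def forward(self, current_obs, history):
        query = matvec(self.W_q, current_obs)            # current state is the query
        keys = [matvec(self.W_k, h) for h in history]    # history provides keys
        values = [matvec(self.W_v, h) for h in history]  # and values
        context, weights = cross_attention(query, keys, values)
        future_flat = matvec(self.W_pred, context)
        action = matvec(self.W_act, context + future_flat)  # fuse by concatenation
        future = [future_flat[i * self.obs_dim:(i + 1) * self.obs_dim]
                  for i in range(self.horizon)]
        return action, future, weights


def prediction_loss(predicted, target):
    """Mean squared error, one plausible supervised signal for the prediction head."""
    sq = [(p - t) ** 2 for pr, tr in zip(predicted, target) for p, t in zip(pr, tr)]
    return sum(sq) / len(sq)


# Usage with made-up sizes: 4-D proprioception, 3-D action, 5 history steps.
rng = random.Random(1)
history = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
policy = FutureConditionedPolicy(obs_dim=4, act_dim=3)
action, future, weights = policy.forward(history[-1], history)
```

In this sketch the action head sees the predicted future states directly, so an auxiliary supervised loss such as `prediction_loss` on the recorded next states would shape the features that drive the action. How the paper weights that loss against the reinforcement learning objective is not stated in the abstract.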

Problem

Research questions and friction points this paper is trying to address.

humanoid parkour

locomotion policy

future-state modeling

whole-body dynamics

terrain adaptation

Innovation

Methods, ideas, or system contributions that make the work stand out.

sequence modeling

future-state prediction

Transformer-based policy