SteadyTray: Learning Object Balancing Tasks in Humanoid Tray Transport via Residual Reinforcement Learning

📅 2026-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of stabilizing unsecured objects on a tray during dynamic bipedal walking of humanoid robots, where gait-induced oscillations destabilize the payload. To this end, the authors propose ReST-RL, a hierarchical reinforcement learning architecture that decouples a low-level robust walking policy from a high-level residual perturbation-suppression module, enabling high-precision tray balancing. This approach achieves smooth object transport without compromising bipedal stability and supports zero-shot transfer to real hardware. Experimental results demonstrate a 96.9% success rate in variable-speed trajectory tracking and 74.5% robustness against external disturbances in simulation. The method is also deployed zero-shot on the Unitree G1 humanoid robot, validating its practical efficacy.

📝 Abstract
Stabilizing unsecured payloads against the inherent oscillations of dynamic bipedal locomotion remains a critical engineering bottleneck for humanoids in unstructured environments. To solve this, we introduce ReST-RL, a hierarchical reinforcement learning architecture that explicitly decouples locomotion from payload stabilization, evaluated via the SteadyTray benchmark. Rather than relying on monolithic end-to-end learning, our framework integrates a robust base locomotion policy with a dynamic residual module engineered to actively cancel gait-induced perturbations at the end-effector. This architectural separation ensures steady tray transport without degrading the underlying bipedal stability. In simulation, the residual design significantly outperforms end-to-end baselines in gait smoothness and orientation accuracy, achieving a 96.9% success rate in variable velocity tracking and 74.5% robustness against external force disturbances. Successfully deployed on the Unitree G1 humanoid hardware, this modular approach demonstrates highly reliable zero-shot sim-to-real generalization across various objects and external force disturbances.
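The core idea the abstract describes, a frozen base locomotion policy whose output is corrected by a small learned residual at the end-effector, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function and policy names are hypothetical, the residual scale is an assumed hyperparameter, and the actual networks are learned in simulation.

```python
import numpy as np

def compose_action(base_policy, residual_policy, obs, residual_scale=0.1):
    """Residual RL composition: the base locomotion policy produces
    full-body joint targets, and a bounded residual correction is
    added on top to cancel gait-induced perturbations at the tray.
    (Names and scale are illustrative assumptions.)"""
    a_base = base_policy(obs)               # base walking joint targets
    a_res = residual_policy(obs)            # learned corrective term
    return a_base + residual_scale * a_res  # small residual preserves gait stability

# Toy stand-ins for the two policies (the real ones are neural networks).
base = lambda obs: np.zeros(12)
residual = lambda obs: np.ones(12)

action = compose_action(base, residual, obs=np.zeros(48))
```

Keeping the residual small (via the scale factor or output clipping) is what lets the high-level module suppress tray perturbations without destabilizing the underlying bipedal gait.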
Problem

Research questions and friction points this paper is trying to address.

humanoid
payload stabilization
bipedal locomotion
object balancing
dynamic perturbations
Innovation

Methods, ideas, or system contributions that make the work stand out.

residual reinforcement learning
hierarchical control
payload stabilization
bipedal locomotion
sim-to-real transfer