Bridging the Sim-to-Real Gap for Athletic Loco-Manipulation

📅 2025-02-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets three obstacles to learning athletic loco-manipulation skills: the sim-to-real gap, reward hacking induced by task rewards (e.g., “maximize throwing distance”), and poorly directed exploration. It proposes a two-stage training pipeline. First, the Unsupervised Actuator Net (UAN) uses real-world data to model complex actuation mechanisms without torque sensing, which narrows the sim-to-real gap and mitigates reward hacking. Second, a pre-training and fine-tuning strategy uses reference trajectories as initial hints to guide exploration rather than as targets to be tracked precisely. On real hardware, the resulting policies lift, throw, and drag with high fidelity between simulation and reality.

📝 Abstract

Achieving athletic loco-manipulation on robots requires moving beyond traditional tracking rewards, which simply guide the robot along a reference trajectory, to task rewards that drive truly dynamic, goal-oriented behaviors. Commands such as "throw the ball as far as you can" or "lift the weight as quickly as possible" compel the robot to exhibit the agility and power inherent in athletic performance. However, training solely with task rewards introduces two major challenges: these rewards are prone to exploitation (reward hacking), and the exploration process can lack sufficient direction. To address these issues, we propose a two-stage training pipeline. First, we introduce the Unsupervised Actuator Net (UAN), which leverages real-world data to bridge the sim-to-real gap for complex actuation mechanisms without requiring access to torque sensing. UAN mitigates reward hacking by ensuring that the learned behaviors remain robust and transferable. Second, we use a pre-training and fine-tuning strategy that leverages reference trajectories as initial hints to guide exploration. With these innovations, our robot athlete learns to lift, throw, and drag with remarkable fidelity from simulation to reality.
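The contrast between tracking rewards and task rewards, and the idea of using a reference trajectory only as an initial hint, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the exponential form of the tracking term, the `scale` value, and the choice to drop the tracking term entirely during fine-tuning are all hypothetical.

```python
import math


def tracking_reward(state, reference, scale=5.0):
    """Hypothetical tracking reward: 1.0 when the state matches the
    reference pose exactly, decaying with squared error."""
    squared_error = sum((s - r) ** 2 for s, r in zip(state, reference))
    return math.exp(-scale * squared_error)


def task_reward(throw_distance_m):
    """Hypothetical task reward for "throw as far as you can":
    the reward is simply the achieved distance in meters."""
    return throw_distance_m


def staged_reward(stage, state, reference, throw_distance_m):
    """Assumed two-phase schedule (not the paper's exact formulation):
    pre-training follows the reference as a hint, fine-tuning
    optimizes the task outcome alone."""
    if stage == "pretrain":
        return tracking_reward(state, reference)
    if stage == "finetune":
        return task_reward(throw_distance_m)
    raise ValueError(f"unknown stage: {stage!r}")


if __name__ == "__main__":
    reference_pose = [0.0, 0.5]
    current_pose = [0.1, 0.4]
    print(staged_reward("pretrain", current_pose, reference_pose, 3.0))
    print(staged_reward("finetune", current_pose, reference_pose, 3.0))
```

In this sketch the fine-tuning reward ignores the reference pose altogether, so a policy that starts near the reference motion is free to depart from it when doing so increases the task outcome. A real training setup would likely add regularization terms and a simulator whose actuator model has been corrected with real-world data, which this toy example does not attempt to show.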

Problem

Research questions and friction points this paper is trying to address.

Bridging sim-to-real gap for robot athletic tasks

Addressing reward hacking in dynamic robot behaviors

Enhancing exploration with pre-training and fine-tuning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised Actuator Net (UAN) that models complex actuation from real-world data without torque sensing

Two-stage training pipeline

Pre-training and fine-tuning strategy that uses reference trajectories as initial hints for exploration
