🤖 AI Summary
Conventional reinforcement learning (RL) struggles to produce high-quality whole-body policies for humanoid robots on complex, multi-limb object-interaction tasks: training converges poorly and the resulting policies are brittle.
Method: We propose a diffusion-prior-guided whole-body control framework. A diffusion model is pre-trained on human motion data to serve as a natural, robust, and transferable motion prior, which is then embedded into the RL policy optimization process. The resulting policy is trained entirely in simulation and deployed directly onto a Unitree G1 robot without domain randomization or fine-tuning.
Contribution/Results: Our approach significantly improves policy convergence speed and action quality. It enables efficient sim-to-real transfer across diverse bimanual and whole-body manipulation tasks—including object lifting, repositioning, and coordinated arm-leg interactions—demonstrating the effectiveness and generalizability of diffusion priors for embodied intelligent control.
📝 Abstract
We introduce DreamControl, a novel methodology for learning autonomous whole-body humanoid skills. DreamControl leverages the strengths of diffusion models and Reinforcement Learning (RL): our core innovation is the use of a diffusion prior trained on human motion data, which subsequently guides an RL policy in simulation to complete specific tasks of interest (e.g., opening a drawer or picking up an object). We demonstrate that this human-motion-informed prior allows RL to discover solutions unattainable by direct RL, and that diffusion models inherently promote natural-looking motions, aiding in sim-to-real transfer. We validate DreamControl's effectiveness on a Unitree G1 robot across a diverse set of challenging tasks involving simultaneous lower and upper body control and object interaction.
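The abstract does not say how the diffusion prior guides the RL policy. One common pattern, assumed here purely for illustration, is to sample a reference motion from the prior and add a tracking term to the task reward, so the policy is rewarded for staying close to the human-like reference while solving the task. The function names, the weight `w_track`, and the scale `sigma` below are hypothetical and are not taken from DreamControl.

```python
import math
from typing import Sequence


def tracking_reward(joint_pos: Sequence[float],
                    ref_joint_pos: Sequence[float],
                    sigma: float = 0.5) -> float:
    """Reward in (0, 1] that decays with squared joint error to the reference.

    ASSUMPTION: ref_joint_pos is one frame of a trajectory sampled from a
    motion prior; the exponential kernel is a generic choice, not the paper's.
    """
    if len(joint_pos) != len(ref_joint_pos):
        raise ValueError("joint_pos and ref_joint_pos must have equal length")
    sq_err = sum((q - r) ** 2 for q, r in zip(joint_pos, ref_joint_pos))
    return math.exp(-sq_err / sigma ** 2)


def guided_reward(task_reward: float,
                  joint_pos: Sequence[float],
                  ref_joint_pos: Sequence[float],
                  w_track: float = 0.5,
                  sigma: float = 0.5) -> float:
    """Task reward plus a weighted tracking term (hypothetical weighting)."""
    return task_reward + w_track * tracking_reward(joint_pos, ref_joint_pos, sigma)


if __name__ == "__main__":
    # Robot pose exactly on the reference: the tracking term is at its maximum.
    print(guided_reward(1.0, [0.1, -0.2], [0.1, -0.2]))
    # Robot pose far from the reference: the tracking term shrinks toward 0.
    print(guided_reward(1.0, [1.0, 1.0], [0.0, 0.0]))
```

In a sketch like this, `w_track` trades off task completion against motion naturalness: a larger value keeps the policy closer to the sampled reference, while a smaller value gives RL more freedom to deviate.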