🤖 AI Summary
This work addresses the sim-to-real transfer challenge for RGB-only, vision-driven humanoid robots that must robustly manipulate diverse articulated objects (e.g., various door types) in real-world environments. We propose a staged-reset exploration strategy together with GRPO-based policy fine-tuning in simulation. Leveraging GPU-accelerated photorealistic simulation, multi-level physical and visual randomization, and a teacher–student–bootstrap learning framework, we achieve end-to-end sim-to-real transfer of a long-horizon, pixels-to-whole-body control policy. To our knowledge, this is the first demonstration of a humanoid robot operating unseen door types zero-shot from RGB input alone. Our method reduces task completion time by up to 31.7% compared to human teleoperation, validating the effectiveness and scalability of full-stack simulation training for complex, dexterous loco-manipulation tasks.
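The summary does not spell out what "multi-level physical and visual randomization" covers in this work. As a rough illustration only, a minimal per-episode sampler for this kind of randomization might look like the sketch below; every parameter name and range here is an assumption for illustration, not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeRandomization:
    """One set of randomized parameters, sampled before each simulated rollout."""
    # -- physical parameters (hypothetical names and ranges) --
    door_mass_kg: float
    hinge_damping: float
    handle_stiffness: float
    floor_friction: float
    # -- visual parameters (hypothetical names and ranges) --
    light_intensity: float
    camera_jitter_m: float
    texture_seed: int


def sample_randomization(rng: random.Random) -> EpisodeRandomization:
    """Draw independent physical and visual parameters for one episode."""
    return EpisodeRandomization(
        door_mass_kg=rng.uniform(5.0, 40.0),
        hinge_damping=rng.uniform(0.1, 2.0),
        handle_stiffness=rng.uniform(1.0, 20.0),
        floor_friction=rng.uniform(0.4, 1.2),
        light_intensity=rng.uniform(0.3, 2.0),
        camera_jitter_m=abs(rng.gauss(0.0, 0.01)),  # small camera-pose perturbation
        texture_seed=rng.randrange(10_000),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_randomization(rng))
```

In practice each episode is simulated under one such draw, so the policy never sees the same physics or appearance twice and cannot overfit to a single curated environment.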
📝 Abstract
Recent progress in GPU-accelerated, photorealistic simulation has opened a scalable data-generation path for robot learning, where large-scale physical and visual randomization allows policies to generalize beyond curated environments. Building on these advances, we develop a teacher–student–bootstrap learning framework for vision-based humanoid loco-manipulation, using articulated-object interaction as a representative high-difficulty benchmark. Our approach introduces a staged-reset exploration strategy that stabilizes long-horizon privileged-policy training, and a GRPO-based fine-tuning procedure that mitigates the effects of partial observability and improves closed-loop consistency in sim-to-real RL. Trained entirely on simulation data, the resulting policy achieves robust zero-shot performance across diverse door types and outperforms human teleoperators by up to 31.7% in task completion time under the same whole-body control stack. This represents the first humanoid sim-to-real policy capable of loco-manipulation of diverse articulated objects using pure RGB perception.
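GRPO (Group Relative Policy Optimization) is referenced but not defined in the abstract. For orientation, the standard GRPO objective samples a group of G rollouts from the same state, normalizes their rewards within the group to obtain advantages, and applies a PPO-style clipped update with a KL penalty toward a reference policy. How this work adapts it to humanoid control is not specified here, so the following is the generic form rather than the paper's exact formulation:

$$
\hat{A}_i = \frac{r_i - \operatorname{mean}(\{r_1,\dots,r_G\})}{\operatorname{std}(\{r_1,\dots,r_G\})}, \qquad
\rho_i = \frac{\pi_\theta(a_i \mid s)}{\pi_{\theta_{\mathrm{old}}}(a_i \mid s)}
$$

$$
\mathcal{J}(\theta) = \mathbb{E}\!\left[\frac{1}{G}\sum_{i=1}^{G} \min\!\big(\rho_i \hat{A}_i,\ \operatorname{clip}(\rho_i,\, 1-\epsilon,\, 1+\epsilon)\,\hat{A}_i\big)\right] - \beta\, \mathbb{D}_{\mathrm{KL}}\!\big(\pi_\theta \,\big\|\, \pi_{\mathrm{ref}}\big)
$$

Because the advantage is relative to the group rather than to a learned value baseline, GRPO needs no critic, which is one plausible reason it suits fine-tuning a pretrained privileged policy under partial observability.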