🤖 AI Summary
This work addresses the domain gap in sim-to-real transfer of robotic manipulation policies with HyperSim, an end-to-end framework that combines three components: high-fidelity environment synthesis, adversarial trajectory generation, and sim-and-real co-training. Together, these components improve visual fidelity, broaden data coverage, and encourage domain-invariant policy representations, narrowing the distributional discrepancy between simulated and real environments. In an evaluation spanning 400 real-world task executions, the full pipeline achieves sim-to-real success rates of 80% with ACT and 95% with π₀; in addition, policies trained on adversarial trajectories attain a 35% higher completion rate under physical perturbations.
📝 Abstract
Scaling data volume and diversity is critical for generalizing embodied intelligence. While synthetic data generation offers a scalable alternative to expensive physical data acquisition, transferring robotic manipulation policies from simulation to the real world (sim-to-real) remains a formidable challenge due to the domain gap. This paper presents HyperSim, a holistic framework spanning from synthetic data generation to policy training and seamless real-world deployment. To systematically bridge the sim-to-real gap, HyperSim is realized through three core pillars: high-fidelity environment synthesis, adversarial trajectory generation, and sim-and-real co-training. Collectively, these modules address domain discrepancies by enhancing visual fidelity, expanding data coverage, and enforcing domain-invariant representations. We rigorously validate HyperSim through a large-scale empirical study involving 400 real-world task executions across two representative manipulation models. Assessed across three fine-grained metrics, our complete pipeline achieves remarkable sim-to-real success rates of 80% and 95% with ACT and π₀, respectively. Furthermore, policies trained on our adversarial trajectories exhibit significantly enhanced robustness against dynamic uncertainties, achieving a 35% higher completion rate under physical perturbations.