Empirical Analysis of Sim-and-Real Cotraining Of Diffusion Policies For Planar Pushing from Pixels

📅 2025-03-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies sim-and-real cotraining, in which a diffusion policy is trained by behavior cloning on demonstrations from both simulation and real hardware, as a way to overcome the sim2real gap in robotic imitation learning. It focuses on a single task, planar pushing from camera images, and uses binary domain probes to analyze what the cotrained policies learn. Key findings: (i) cotraining with simulated data can substantially improve real-world performance, especially when real data is limited; (ii) gains from simulated data eventually plateau, while adding real data raises the performance ceiling; (iii) reducing the physics gap may matter more than visual fidelity for non-prehensile manipulation; and (iv) some visual domain gap helps, since high-performing policies learn to distinguish simulated from real inputs. The experiments span over 40 real-world policies (800+ trials) and 200 simulated policies (40,000+ trials), and the results are intended to inform simulation design, dataset creation, and policy training.

📝 Abstract

In imitation learning for robotics, cotraining with demonstration data generated both in simulation and on real hardware has emerged as a powerful recipe to overcome the sim2real gap. This work seeks to elucidate basic principles of this sim-and-real cotraining to help inform simulation design, sim-and-real dataset creation, and policy training. Focusing narrowly on the canonical task of planar pushing from camera inputs enabled us to be thorough in our study. These experiments confirm that cotraining with simulated data can dramatically improve performance in real, especially when real data is limited. Performance gains scale with simulated data, but eventually plateau; real-world data increases this performance ceiling. The results also suggest that reducing the domain gap in physics may be more important than visual fidelity for non-prehensile manipulation tasks. Perhaps surprisingly, having some visual domain gap actually helps the cotrained policy -- binary probes reveal that high-performing policies learn to distinguish simulated domains from real. We conclude by investigating this nuance and mechanisms that facilitate positive transfer between sim-and-real. In total, our experiments span over 40 real-world policies (evaluated on 800+ trials) and 200 simulated policies (evaluated on 40,000+ trials).
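As an illustration of what cotraining on both data sources can look like at the batch level, here is a minimal sketch in Python. It assumes that each training batch is drawn from the real dataset with some probability and from the simulated dataset otherwise. The function name `sample_cotraining_batch` and the `real_fraction` parameter are hypothetical and are not taken from the paper, which may mix the two datasets differently.

```python
import random


def sample_cotraining_batch(real_data, sim_data, batch_size, real_fraction, rng=None):
    """Draw a mixed batch of (domain, example) pairs.

    Each slot is filled from real_data with probability real_fraction and
    from sim_data otherwise. real_fraction is an assumed tuning knob for
    this sketch, not a value reported in the paper.
    """
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must lie in [0, 1]")
    rng = rng or random.Random()
    batch = []
    for _ in range(batch_size):
        # rng.random() is in [0, 1), so real_fraction=1.0 always picks real
        # and real_fraction=0.0 always picks sim.
        if rng.random() < real_fraction:
            batch.append(("real", rng.choice(real_data)))
        else:
            batch.append(("sim", rng.choice(sim_data)))
    return batch
```

With a small real dataset and a large simulated one, a sampler of this kind lets the share of real examples per batch be set independently of the dataset sizes, which is one way to keep scarce real data from being swamped by simulated data.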

Problem

Research questions and friction points this paper is trying to address.

Investigates sim-and-real cotraining to bridge the sim2real gap in robotics

Examines impact of simulated and real data on policy performance

Analyzes domain gap effects in physics and visual fidelity

Innovation

Methods, ideas, or system contributions that make the work stand out.

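Cotraining on simulated and real demonstrations improves real-world performance, especially when real data is limited

Reducing the physics domain gap may matter more than improving visual fidelity

Some visual domain gap helps: high-performing policies learn to distinguish simulated from real inputs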
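The binary probes mentioned in the abstract can be pictured as a small classifier trained on fixed policy features to predict whether an input came from simulation or from real hardware. The sketch below is a generic logistic-regression probe in plain Python. The function names, the learning rate, and the step count are assumptions for illustration, and the paper's actual probe architecture and the layer it reads from are not specified here.

```python
import math


def train_binary_probe(features, labels, lr=0.1, steps=200):
    """Fit a logistic-regression probe by gradient descent.

    features: list of equal-length float lists (frozen policy embeddings).
    labels:   list of 0 (sim) or 1 (real).
    Returns the weight vector and bias of the probe.
    """
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(features, labels, w, b):
    """Fraction of examples whose predicted domain matches the label."""
    correct = 0
    for x, y in zip(features, labels):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        correct += int(pred == y)
    return correct / len(features)
```

High probe accuracy on held-out features would indicate that the domain is linearly decodable from the policy's representation, which is the kind of evidence the abstract describes for high-performing policies.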
