🤖 AI Summary
To unify modeling, evaluation, and reproducibility for robot imitation learning across simulation and real-world environments, this paper introduces RoboManipBaselines, presented as the first integrated, general-purpose, scalable, and reproducible open framework for robotic manipulation. Through a modular pipeline compatible with diverse robot hardware and task configurations, the framework supports data collection in both simulated and real environments, training of multimodal policies (combining vision, language, and action), and standardized evaluation. Its core contributions are: (1) a cross-platform benchmarking suite covering multi-task, multi-robot, and multimodal policies; (2) shared processing and aligned evaluation of simulated and real-world data; and (3) improved experimental reproducibility and fairer algorithmic comparison. Validation on multiple physical robot platforms demonstrates its effectiveness for policy learning and Sim2Real transfer.
📝 Abstract
RoboManipBaselines is an open framework for robot imitation learning that unifies data collection, training, and evaluation across simulation and real robots. We introduce it as a platform enabling systematic benchmarking of diverse tasks, robots, and multimodal policies with emphasis on integration, generality, extensibility, and reproducibility.