CoinRobot: Generalized End-to-end Robotic Learning for Physical Intelligence

📅 2025-03-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address key bottlenecks—including poor cross-platform generalization, hardware-specific coupling, and lack of objective real-world evaluation—this paper proposes an end-to-end physical intelligence learning framework. Methodologically, it introduces: (1) a hardware-agnostic unified architecture with modular sensor-action interfaces, enabling seamless deployment across industrial arms, collaborative robots, and novel morphologies; (2) the first diffusion-model-based end-to-end training paradigm tailored for physical intelligence, integrating multi-task learning and a lightweight Transformer backbone; and (3) the first objective, reproducible real-world manipulation benchmark. Evaluated on seven diverse tasks, the framework outperforms LeRobot across all metrics; maintains robustness across seven heterogeneous robotic platforms and dynamic environments; achieves 40% higher deployment efficiency; and improves generalization success rate by 28.6%.

📝 Abstract
Physical intelligence holds immense promise for advancing embodied intelligence, enabling robots to acquire complex behaviors from demonstrations. However, achieving generalization and transfer across diverse robotic platforms and environments requires careful design of model architectures, training strategies, and data diversity. Meanwhile, existing systems often struggle with scalability, adaptability to heterogeneous hardware, and objective evaluation in real-world settings. We present a generalized end-to-end robotic learning framework designed to bridge this gap. Our framework introduces a unified architecture that supports cross-platform adaptability, enabling seamless deployment across industrial-grade robots, collaborative arms, and novel embodiments without task-specific modifications. By integrating multi-task learning with streamlined network designs, it achieves more robust performance than conventional approaches, while maintaining compatibility with varying sensor configurations and action spaces. We validate our framework through extensive experiments on seven manipulation tasks. Notably, diffusion-based models trained in our framework demonstrated superior performance and generalizability compared to the LeRobot framework, achieving performance improvements across diverse robotic platforms and environmental conditions.
Problem

Research questions and friction points this paper is trying to address.

Achieving generalization across diverse robotic platforms and environments.
Addressing scalability and adaptability to heterogeneous hardware.
Improving objective evaluation in real-world robotic settings.
Innovation

Methods, ideas, or system contributions that make the work stand out.

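Unified architecture for cross-platform adaptability across industrial-grade robots, collaborative arms, and novel embodiments.
Multi-task learning combined with streamlined network designs.
Diffusion-based models that outperform the LeRobot framework across robotic platforms and environmental conditions.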
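
One way to picture the cross-platform idea is a thin adapter layer between a single policy and each robot. The sketch below is only an illustration under stated assumptions: the summary does not describe the framework's actual interfaces, so `Observation`, `RobotAdapter`, `SixAxisArm`, `ParallelGripper`, `fit_action`, and `step` are hypothetical names, and the pad-or-truncate rule is a toy stand-in for whatever action-space mapping the authors use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol, Sequence


@dataclass
class Observation:
    """Platform-independent observation: named sensor readings as float lists."""
    sensors: Dict[str, List[float]]


class RobotAdapter(Protocol):
    """Hypothetical per-platform adapter (an assumption, not the paper's API)."""
    action_dim: int

    def read(self) -> Observation: ...
    def apply(self, action: Sequence[float]) -> List[float]: ...


class SixAxisArm:
    """Toy stand-in for an industrial arm with six joints."""
    action_dim = 6

    def __init__(self) -> None:
        self.joints = [0.0] * 6

    def read(self) -> Observation:
        return Observation({"joint_pos": list(self.joints)})

    def apply(self, action: Sequence[float]) -> List[float]:
        # Interpret the action as joint-position deltas.
        self.joints = [q + a for q, a in zip(self.joints, action)]
        return list(self.joints)


class ParallelGripper:
    """Toy stand-in for a one-degree-of-freedom gripper."""
    action_dim = 1

    def __init__(self) -> None:
        self.opening = 0.0

    def read(self) -> Observation:
        return Observation({"opening": [self.opening]})

    def apply(self, action: Sequence[float]) -> List[float]:
        # Clamp the opening to the range [0, 1].
        self.opening = min(1.0, max(0.0, self.opening + action[0]))
        return [self.opening]


def fit_action(raw: Sequence[float], action_dim: int) -> List[float]:
    """Truncate or zero-pad a policy output to the platform's action size."""
    trimmed = list(raw[:action_dim])
    return trimmed + [0.0] * (action_dim - len(trimmed))


def step(policy: Callable[[Observation], Sequence[float]],
         robot: RobotAdapter) -> List[float]:
    """Run one control step: read sensors, query the policy, apply the action."""
    obs = robot.read()
    return robot.apply(fit_action(policy(obs), robot.action_dim))


def constant_policy(obs: Observation) -> List[float]:
    # Placeholder for a learned policy: always emits the same 7-D vector.
    return [0.1] * 7


if __name__ == "__main__":
    for robot in (SixAxisArm(), ParallelGripper()):
        print(type(robot).__name__, step(constant_policy, robot))
```

The same `constant_policy` drives both toy robots without modification, which is the property the summary attributes to modular sensor-action interfaces. A real system would replace the placeholder policy with a trained model and the pad-or-truncate rule with a learned or calibrated mapping.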
Yu Zhao
ZhiCheng AI
Huxian Liu
ZhiCheng AI
Xiang Chen
Peking University
Jiankai Sun
Stanford University
Jiahuan Yan
ZhiCheng AI
Luhui Hu
Aurorain AI, ex-Meta, Microsoft, Amazon
AI engineering, data cloud, foundation models