RoCo Challenge at AAAI 2026: Benchmarking Robotic Collaborative Manipulation for Assembly Towards Industrial Automation

📅 2026-03-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of high-precision, long-horizon bimanual collaborative assembly in industrial settings by introducing RoCo, a benchmark for planetary gearbox assembly. It establishes, for the first time, a two-stage evaluation framework that integrates simulation and real-world environments. The benchmark incorporates fine-grained task-phase scoring and a curriculum-based data strategy for fault recovery, supported by a custom Isaac Sim data collection system, a dual-arm robotic platform, and a high-quality teleoperated dataset enabling multi-task, long-horizon learning. Evaluated through a competition involving over 60 teams, the benchmark validates the effectiveness of a dual-model long-horizon learning framework, with methods such as ARC-VLA and RoboCola significantly improving assembly success rates and robustness, thereby advancing the practical deployment of embodied intelligence in industrial collaborative assembly.

📝 Abstract
Embodied Artificial Intelligence (EAI) is developing rapidly, shifting the paradigm of autonomous systems from isolated perception to integrated, continuous action. This transition is highly significant for industrial robotic manipulation, promising to free human workers from repetitive and dangerous labor. To benchmark and advance this capability, we introduce the Robotic Collaborative Assembly Assistance (RoCo) Challenge, together with a dataset for assembly manipulation in both simulation and the real world. Set against the backdrop of human-centered manufacturing, the challenge focuses on a high-precision planetary gearbox assembly task, a demanding yet highly representative operation in modern industry. Built upon a self-developed data collection, training, and evaluation system in Isaac Sim, and using a dual-arm robot for real-world deployment, the challenge operates in two phases. The Simulation Round defines fine-grained task phases for step-wise scoring to handle the long-horizon nature of the assembly. The Real-World Round mirrors this evaluation with physical gearbox components and high-quality teleoperated datasets. The core tasks require assembling an epicyclic gearbox from scratch, including mounting three planet gears, a sun gear, and a ring gear. Attracting over 60 teams and 170+ participants from more than 10 countries, the challenge yielded highly effective solutions, most notably ARC-VLA and RoboCola. Results show that a dual-model framework for long-horizon multi-task learning is highly effective and that strategic use of recovery-from-failure curriculum data is critical for successful deployment. This report outlines the competition setup, evaluation approach, key findings, and future directions for industrial EAI. Our dataset, CAD files, code, and evaluation results can be found at: https://rocochallenge.github.io/RoCo2026/.
Problem

Research questions and friction points this paper is trying to address.

Robotic Collaborative Manipulation
Industrial Automation
Assembly Task
Embodied Artificial Intelligence
Long-horizon Task
Innovation

Methods, ideas, or system contributions that make the work stand out.

Robotic Collaborative Manipulation
Embodied AI
Long-horizon Assembly
Failure Recovery Curriculum
Dual-arm Robot Benchmarking
Haichao Liu
Nanyang Technological University, Singapore
Yuheng Zhou
Nanyang Technological University, Singapore
Zhenyu Wu
Beijing University of Posts and Telecommunications, Beijing, China
Ziheng Ji
Nanyang Technological University, Singapore
Ziyu Shan
Nanyang Technological University
Embodied AI, Point Cloud Quality Assessment, Low-level Vision
Qianzhun Wang
Nanyang Technological University, Singapore
Ruixuan Liu
Carnegie Mellon University
Robotics, Machine Learning, Optimization, Human-robot Interaction
Zhiyuan Yang
Northeastern University
computer vision, remote sensing
Yejun Gu
IHPC, A*STAR & Johns Hopkins University
Plasticity, Multiscale Modeling
Shalman Khan
Agency for Science, Technology and Research (A*STAR), Singapore
Shijun Yan
Agency for Science, Technology and Research (A*STAR), Singapore
Jun Liu
University of Science and Technology of China
Haiyue Zhu
Agency for Science, Technology and Research (A*STAR), Singapore
Changliu Liu
Associate Professor, Carnegie Mellon University
Robotics, human-robot interactions, motion planning, optimization, multi-agent system
Jianfei Yang
Assistant Professor, Director of MARS Lab, Nanyang Technological University
Physical AI, Embodied AI, Multimodal AI
Jingbing Zhang
Agency for Science, Technology and Research (A*STAR), Singapore
Ziwei Wang
School of Electrical and Electronic Engineering, Nanyang Technological University
embodied AI, robotics, computer vision