RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three obstacles in robotic manipulation: the scarcity of task-aligned physical interaction data, semantic-spatial misalignment in vision-language models, and physical hallucinations in video generation models. The authors propose a closed-loop framework in which a planner and a simulator co-evolve, starting from only 500 unlabeled seed images. The approach uses a cognition-inspired two-phase mechanism: "daytime" exploration, driven by a semantically controlled multi-granular reward, and "nighttime" consolidation, which mines near-miss failures to stabilize policy optimization. An autonomous progressive curriculum scales the system from atomic actions to complex tasks. Experiments report that base planners improve by 30 absolute points and simulator success rises by 48% on average. With 500 unlabeled seeds (a 50x reduction in data), the method surpasses fully supervised baselines, and it learns continually without catastrophic forgetting.

📝 Abstract

The scalability of robotic manipulation is fundamentally bottlenecked by the scarcity of task-aligned physical interaction data. While vision-language models (VLMs) and video generation models (VGMs) hold promise for autonomous data synthesis, they suffer from semantic-spatial misalignment and physical hallucinations, respectively. To bridge this gap, we introduce RoboEvolve, a novel framework that couples a VLM planner and a VGM simulator into a mutually reinforcing co-evolutionary loop. Operating purely on unlabeled seed images, RoboEvolve leverages a cognitive-inspired dual-phase mechanism: (i) daytime exploration fosters physically grounded behavioral discovery through a semantic-controlled multi-granular reward, and (ii) nighttime consolidation mines "near-miss" failures to stabilize policy optimization. Guided by an autonomous progressive curriculum, the system naturally scales from simple atomic actions to complex tasks. Extensive experiments demonstrate that RoboEvolve (I) achieves superior effectiveness, elevating base planners by 30 absolute points and amplifying simulator success by 48% on average; (II) exhibits extreme data efficiency, surpassing fully supervised baselines with merely 500 unlabeled seeds--a 50x reduction; and (III) demonstrates robust continual learning without catastrophic forgetting.
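The control flow the abstract describes (daytime exploration, nighttime consolidation on near-miss failures, and a curriculum that advances with success) can be illustrated with a toy sketch. This is not the authors' implementation: the planner and simulator are replaced by a single scalar `skill`, and every name, threshold, and update rule below (`Rollout`, `day_phase`, `mine_near_misses`, `night_phase`, `co_evolve`, the 0.7 success cutoff, the 0.2 near-miss margin) is a hypothetical stand-in chosen only to make the loop runnable.

```python
import random
from dataclasses import dataclass


@dataclass
class Rollout:
    seed_id: int   # which seed image the rollout started from
    level: int     # curriculum level (1 = atomic action)
    reward: float  # in [0, 1]; stand-in for the multi-granular reward


def day_phase(seed_ids, level, skill, rng):
    """Exploration: one scored rollout per seed (toy reward model, an assumption)."""
    rollouts = []
    for sid in seed_ids:
        # Harder levels lower the reward, higher skill raises it, plus noise.
        raw = skill - 0.1 * (level - 1) + rng.uniform(-0.2, 0.2)
        rollouts.append(Rollout(sid, level, min(1.0, max(0.0, raw))))
    return rollouts


def mine_near_misses(rollouts, success=0.7, margin=0.2):
    """Keep failures whose reward falls just below the success threshold."""
    return [r for r in rollouts if success - margin <= r.reward < success]


def night_phase(skill, rollouts, near_misses, lr=0.05):
    """Consolidation: toy update driven by successes and near-misses."""
    successes = sum(r.reward >= 0.7 for r in rollouts)
    signal = (successes + 0.5 * len(near_misses)) / max(1, len(rollouts))
    return min(1.0, skill + lr * signal)


def co_evolve(num_seeds=20, cycles=10, rng_seed=0):
    """Alternate day and night phases; return (level, skill, near-miss count) per cycle."""
    rng = random.Random(rng_seed)
    skill, level = 0.5, 1
    history = []
    for _ in range(cycles):
        rollouts = day_phase(range(num_seeds), level, skill, rng)
        near = mine_near_misses(rollouts)
        skill = night_phase(skill, rollouts, near)
        success_rate = sum(r.reward >= 0.7 for r in rollouts) / num_seeds
        # Progressive curriculum: advance once the current level is mostly solved.
        if success_rate >= 0.5:
            level += 1
        history.append((level, round(skill, 3), len(near)))
    return history


if __name__ == "__main__":
    for cycle, state in enumerate(co_evolve(), start=1):
        print(cycle, state)
```

In the real system the scalar update would presumably be replaced by training the VLM planner and VGM simulator on the collected rollouts; the sketch only shows how near-miss filtering and curriculum advancement could slot into an alternating loop.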

Problem

Research questions and friction points this paper is trying to address.

robotic manipulation

data scarcity

semantic-spatial misalignment

physical hallucinations

task-aligned data

Innovation

Methods, ideas, or system contributions that make the work stand out.

co-evolutionary framework

vision-language model

video generation model