Iterative Compositional Data Generation for Robot Control

📅 2025-12-11

📈 Citations: 0

✨ Influential: 0

career value

195K/year

🤖 AI Summary

Data collection for robotic manipulation tasks involving combinations of multiple objects, robots, and environments is costly and yields poor generalization. Method: We propose a semantic-composable Diffusion Transformer that decouples state transitions into four semantic components—robot, object, obstacle, and goal—and models their interactions via attention mechanisms. We further introduce an iterative self-improving synthetic data paradigm: zero-shot generation → offline RL validation (using CQL/BCQ) → feedback-augmented knowledge distillation. Contribution/Results: This work establishes the first interpretable and generalizable semantic compositional representation for robot control. It achieves near 100% zero-shot policy success on unseen task compositions, significantly outperforming monolithic models and hand-coded compositional baselines. The synthesized data directly enables end-to-end policy learning without real-world fine-tuning.

Technology Category

Application Category

📝 Abstract

Collecting robotic manipulation data is expensive, making it impractical to acquire demonstrations for the combinatorially large space of tasks that arise in multi-object, multi-robot, and multi-environment settings. While recent generative models can synthesize useful data for individual tasks, they do not exploit the compositional structure of robotic domains and struggle to generalize to unseen task combinations. We propose a semantic compositional diffusion transformer that factorizes transitions into robot-, object-, obstacle-, and objective-specific components and learns their interactions through attention. Once trained on a limited subset of tasks, we show that our model can zero-shot generate high-quality transitions from which we can learn control policies for unseen task combinations. Then, we introduce an iterative self-improvement procedure in which synthetic data is validated via offline reinforcement learning and incorporated into subsequent training rounds. Our approach substantially improves zero-shot performance over monolithic and hard-coded compositional baselines, ultimately solving nearly all held-out tasks and demonstrating the emergence of meaningful compositional structure in the learned representations.

Problem

Research questions and friction points this paper is trying to address.

Generates robotic transitions for unseen task combinations

Overcomes data scarcity in multi-object multi-robot settings

Enables zero-shot policy learning via compositional synthesis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic compositional diffusion transformer factorizes transitions

Zero-shot generation of high-quality transitions for unseen tasks

Iterative self-improvement with offline reinforcement learning validation

🔎 Similar Papers

No similar papers found.