AgentWorld: An Interactive Simulation Platform for Scene Construction and Mobile Robotic Manipulation

📅 2025-08-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address sim-to-real transfer and poor skill generalization for mobile robots in home environments, this paper introduces AgentWorld, an interactive simulation platform for mobile manipulation. AgentWorld automates the construction of domestic scenes (layout generation, semantic asset placement, visual material configuration, and physics simulation) and provides dual-mode teleoperation that supports both wheeled bases and humanoid locomotion policies. Data collected with the platform forms the AgentWorld Dataset, which spans primitive actions such as pick-and-place and push-pull through multistage activities such as serving drinks and heating food, across living rooms, bedrooms, and kitchens. The authors benchmark several imitation learning methods on this dataset: behavior cloning, action chunking transformers, diffusion policies, and vision-language-action models. They report that the dataset is effective for sim-to-real transfer, helping narrow the gap between simulation-based training and real-world deployment in household settings.

📝 Abstract

We introduce AgentWorld, an interactive simulation platform for developing household mobile manipulation capabilities. Our platform combines automated scene construction that encompasses layout generation, semantic asset placement, visual material configuration, and physics simulation, with a dual-mode teleoperation system supporting both wheeled bases and humanoid locomotion policies for data collection. The resulting AgentWorld Dataset captures diverse tasks ranging from primitive actions (pick-and-place, push-pull, etc.) to multistage activities (serve drinks, heat up food, etc.) across living rooms, bedrooms, and kitchens. Through extensive benchmarking of imitation learning methods including behavior cloning, action chunking transformers, diffusion policies, and vision-language-action models, we demonstrate the dataset's effectiveness for sim-to-real transfer. The integrated system provides a comprehensive solution for scalable robotic skill acquisition in complex home environments, bridging the gap between simulation-based training and real-world deployment. The code and datasets will be available at https://yizhengzhang1.github.io/agent_world/

Problem

Research questions and friction points this paper is trying to address.

Develop interactive simulation for household mobile manipulation

Combine scene construction with dual-mode teleoperation for data collection

Bridge gap between simulation training and real-world robotic deployment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated scene construction with layout generation

Dual-mode teleoperation for wheeled and humanoid robots

Benchmarking imitation learning for sim-to-real transfer
