AI Summary
Existing robotics datasets and benchmarks predominantly focus on short-horizon, single-task domestic or desktop scenarios, and lack support for long-horizon, multi-task embodied intelligence research in complex commercial environments such as supermarkets.
Method: We propose an embodied intelligence research framework for supermarket settings. It uses multi-agent procedural content generation (PCG), guided by text and reference-image inputs and by real-world spatial design principles, to automatically synthesize structured, scalable 3D supermarket environments. We curate a standardized 3D asset library containing 1,100+ supermarket goods and develop a modular agent architecture. We also establish a two-task benchmark covering checkout unloading (cashier agents) and in-aisle item collection (salesperson agents), validated through sim-to-real transfer.
Contribution/Results: This work addresses gaps in commercial embodied intelligence across data, environment, and evaluation. It provides a scalable platform for developing retail agents, validated through the deployment of a modular agent system and successful sim-to-real transfer.
Abstract
The development of embodied agents for complex commercial environments is hindered by a critical gap in existing robotics datasets and benchmarks, which primarily focus on household or tabletop settings with short-horizon tasks. To address this limitation, we introduce MarketGen, a scalable simulation platform with automatic scene generation for complex supermarket environments. MarketGen features a novel agent-based Procedural Content Generation (PCG) framework. It supports multi-modal inputs (text and reference images) and integrates real-world design principles to automatically generate complete, structured, and realistic supermarkets. We also provide an extensive and diverse 3D asset library with a total of 1,100+ supermarket goods and parameterized facility assets. Building on this generative foundation, we propose a novel benchmark for assessing supermarket agents, featuring two everyday supermarket tasks: (1) Checkout Unloading: long-horizon tabletop tasks for cashier agents, and (2) In-Aisle Item Collection: complex mobile manipulation tasks for salesperson agents. We validate our platform and benchmark through extensive experiments, including the deployment of a modular agent system and successful sim-to-real transfer. MarketGen provides a comprehensive framework to accelerate research in embodied AI for complex commercial applications.