🤖 AI Summary
Existing virtual environments for AI agent research treat physical task solving and social simulation as disjoint domains: the former neglects social dynamics, while the latter lacks physical grounding. This work introduces IndoorWorld, a heterogeneous multi-agent virtual environment that unifies physical manipulation and social interaction, so that LLM-driven agents can collaboratively carry out embodied physical actions and socially situated behaviors within realistic spatial settings. Its core contributions are: (1) bidirectional dynamic anchoring between physical environment states and social behaviors; (2) a scalable, occupant-level simulation paradigm tailored for architectural human factors design; and (3) a unified framework integrating a spatially aware world model, resource competition modeling, a social behavior state machine, and a physics-based action engine. Experiments in office scenarios examine how spatial layout, collaboration mechanisms, and resource constraints affect collective behavior, illustrating the potential of LLM-based occupant simulation for architectural design.
📝 Abstract
Virtual environments are essential to AI agent research. Existing environments for LLM agent research typically focus on either physical task solving or social simulation: the former oversimplifies agent individuality and social dynamics, while the latter lacks physical grounding for social behaviors. We introduce IndoorWorld, a heterogeneous multi-agent environment that tightly integrates physical and social dynamics. IndoorWorld poses novel challenges for LLM-driven agents, which must orchestrate social dynamics to influence the physical environment and anchor their social interactions in world states. In doing so, it opens up possibilities for LLM-based building occupant simulation in architectural design. We demonstrate this potential with a series of experiments in an office setting that examine the impact of multi-agent collaboration, resource competition, and spatial layout on agent behavior.