SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds

📅 2025-11-30

🤖 AI Summary

Problem: Existing simulation platforms suffer from limited environmental diversity, low-fidelity modeling of physical and social rules, and inadequate native support for LLM/VLM agents. Method: The paper proposes SimWorld, a high-fidelity open-world simulation platform built on Unreal Engine 5. It features language-driven procedural world generation, integrated physics simulation and social dynamics modeling, multimodal perception, open-vocabulary action execution, and a hierarchical abstract action space. The platform supports customizable multi-agent cooperative and competitive scenarios and is compatible with mainstream models including GPT-4o, Gemini, Claude, and DeepSeek. Contribution/Results: Deployed on long-horizon multi-agent delivery tasks, the platform reveals significant behavioral disparities across models in strategic reasoning, social interaction, and environmental adaptation. It aims to provide a unified simulation foundation that combines high realism and scalability for training and evaluating LLM/VLM agents, with a view to real-world transfer.

📝 Abstract

While LLM/VLM-powered AI agents have advanced rapidly in math, coding, and computer use, their applications in complex physical and social environments remain challenging. Building agents that can survive and thrive in the real world (for example, by autonomously earning income or running a business) requires massive-scale interaction, reasoning, training, and evaluation across diverse embodied scenarios. However, existing world simulators for such development fall short: they often rely on limited hand-crafted environments, simulate simplified game-like physics and social rules, and lack native support for LLM/VLM agents. We introduce SimWorld, a new simulator built on Unreal Engine 5, designed for developing and evaluating LLM/VLM agents in rich, real-world-like settings. SimWorld offers three core capabilities: (1) realistic, open-ended world simulation, including accurate physical and social dynamics and language-driven procedural environment generation; (2) a rich interface for LLM/VLM agents, with multimodal world inputs and open-vocabulary actions at varying levels of abstraction; and (3) diverse and extensible physical and social reasoning scenarios that are easily customizable by users. We demonstrate SimWorld by deploying frontier LLM agents (e.g., GPT-4o, Gemini-2.5-Flash, Claude-3.5, and DeepSeek-Prover-V2) on long-horizon multi-agent delivery tasks involving strategic cooperation and competition. The results reveal distinct reasoning patterns and limitations across models. We open-source SimWorld and hope it becomes a foundational platform for advancing real-world agent intelligence across disciplines: https://simworld.org.

Problem

Research questions and friction points this paper is trying to address.

Existing simulators lack realistic physical and social environments for AI agents

Current simulators lack native support for LLM/VLM agents in complex scenarios

There is a need for scalable platforms to train agents for real-world tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

SimWorld uses Unreal Engine 5 for realistic physical and social simulation

It provides multimodal interfaces and open-vocabulary actions for LLM/VLM agents

The simulator supports customizable, procedurally generated environments for diverse scenarios
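To make the idea of "open-vocabulary actions at varying levels of abstraction" concrete, here is a minimal Python sketch. It is not SimWorld's API: `ToyWorld`, `plan_primitives`, and `execute` are hypothetical names, the world is a toy 2D grid, the observation is text-only, and simple keyword matching stands in for the language understanding a real system would need. It only illustrates how a high-level command can be expanded into low-level primitives while the same interface also accepts primitives directly.

```python
from dataclasses import dataclass

# Hypothetical toy example; none of these names come from SimWorld.
MOVES = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


@dataclass
class ToyWorld:
    agent: tuple[int, int]
    landmarks: dict[str, tuple[int, int]]

    def observe(self) -> str:
        """Text-only stand-in for a multimodal observation."""
        return f"agent at {self.agent}; landmarks: {self.landmarks}"

    def step(self, direction: str) -> None:
        """Low-level action: move one grid cell."""
        dx, dy = MOVES[direction]
        self.agent = (self.agent[0] + dx, self.agent[1] + dy)


def plan_primitives(world: ToyWorld, target: str) -> list[str]:
    """Expand a high-level 'go to' into moves (x axis first, then y)."""
    (x, y), (tx, ty) = world.agent, world.landmarks[target]
    horizontal = ["east" if tx > x else "west"] * abs(tx - x)
    vertical = ["north" if ty > y else "south"] * abs(ty - y)
    return horizontal + vertical


def execute(world: ToyWorld, command: str) -> list[str]:
    """Accept a free-text command at either abstraction level."""
    words = command.lower().split()
    target = " ".join(words[2:])
    if len(words) == 2 and words[0] == "move" and words[1] in MOVES:
        primitives = [words[1]]  # already a low-level action
    elif words[:2] == ["go", "to"] and target in world.landmarks:
        primitives = plan_primitives(world, target)  # high-level action
    else:
        raise ValueError(f"unrecognized command: {command!r}")
    for direction in primitives:
        world.step(direction)
    return primitives


world = ToyWorld(agent=(0, 0), landmarks={"store": (2, 1)})
high_level_trace = execute(world, "go to store")
low_level_trace = execute(world, "move west")
```

In this sketch an agent could alternate between calling `world.observe()` and issuing commands through `execute`, choosing the abstraction level per step. A realistic interface would replace the keyword matching with a language model and the grid with the simulator's own state.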
