SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model

📅 2025-07-31
🤖 AI Summary
Current LLM-based agents predominantly follow a “one-task-one-agent” paradigm, which scales poorly and inherits the limitations of autoregressive generation, such as sequential dependency and a lack of explicit causal reasoning. To address these challenges, we propose SimuRA, a general-purpose agent architecture built around an LLM-based world model that performs multi-step, simulation-based reasoning in the latent space of natural language, enabling human-like mental simulation for goal-directed planning. By decoupling planning from autoregressive token generation, SimuRA supports forward-looking decision-making and adaptation across diverse environments. In experiments, SimuRA raises the task success rate on a web-based flight search benchmark from 0% to 32.2%, and its world-model-based planning shows an advantage of up to 124% over autoregressive planning. These results support world-model simulation as a reasoning paradigm for general, goal-oriented autonomous agents.

๐Ÿ“ Abstract
AI agents built on large language models (LLMs) hold enormous promise, but current practice focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also suffers from the fundamental limitations of autoregressive LLMs. On the other hand, humans are general agents who reason by mentally simulating the outcomes of their actions and plans. Moving towards a more general and powerful AI agent, we introduce SimuRA, a goal-oriented architecture for generalized agentic reasoning. Based on a principled formulation of optimal agent in any environment, modelname overcomes the limitations of autoregressive reasoning by introducing a world model for planning via simulation. The generalized world model is implemented using LLM, which can flexibly plan in a wide range of environments using the concept-rich latent space of natural language. Experiments on difficult web browsing tasks show that modelname improves the success of flight search from 0% to 32.2%. World-model-based planning, in particular, shows consistent advantage of up to 124% over autoregressive planning, demonstrating the advantage of world model simulation as a reasoning paradigm. We are excited about the possibility for training a single, general agent model based on LLMs that can act superintelligently in all environments. To start, we make SimuRA, a web-browsing agent built on modelname with pretrained LLMs, available as a research demo for public testing.

Problem

Research questions and friction points this paper is trying to address.

Overcoming the limitations of the one-task-one-agent approach to LLM-based agents
Enabling general goal-oriented reasoning via world-model simulation
Improving success rates on complex tasks such as web browsing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Simulative reasoning architecture for general agents
LLM-based world model for flexible planning
Web browsing agent with pretrained LLMs
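
To make the idea of planning via simulation concrete, here is a minimal toy sketch of a propose, simulate, evaluate loop. It is not the SimuRA implementation: the summary above describes an LLM-based world model operating over natural-language states, whereas this sketch assumes a trivial integer environment. The names `propose_actions`, `world_model`, `critic`, `simulate`, `plan`, and `run_agent` are all hypothetical, and each of the first three stands in for what would be a prompted LLM call.

```python
from typing import Sequence, Tuple

State = int
Action = str

# Toy environment (assumption): the state is an integer and each action adds
# a fixed offset. In an LLM-based agent these would be natural-language
# observations and free-form actions.
ACTION_EFFECTS = {"inc": 1, "dec": -1, "jump": 3}


def propose_actions(state: State) -> Sequence[Action]:
    """Hypothetical policy stub: list candidate actions for a state."""
    return list(ACTION_EFFECTS)


def world_model(state: State, action: Action) -> State:
    """Hypothetical world-model stub: predict the next state."""
    return state + ACTION_EFFECTS[action]


def critic(state: State, goal: State) -> float:
    """Hypothetical critic stub: higher is better, 0.0 means goal reached."""
    return float(-abs(goal - state))


def simulate(state: State, goal: State, depth: int) -> Tuple[float, Tuple[Action, ...]]:
    """Search imagined futures up to `depth` steps ahead.

    Returns the best critic score found and the action sequence reaching it.
    Ties on score are broken in favor of shorter plans.
    """
    if depth == 0 or state == goal:
        return critic(state, goal), ()
    best_score, best_plan = float("-inf"), ()
    for action in propose_actions(state):
        imagined = world_model(state, action)  # no real action is taken here
        score, rest = simulate(imagined, goal, depth - 1)
        candidate = (action,) + rest
        if (score, -len(candidate)) > (best_score, -len(best_plan)):
            best_score, best_plan = score, candidate
    return best_score, best_plan


def plan(state: State, goal: State, horizon: int = 3) -> Tuple[Action, ...]:
    """Return the best simulated action sequence from `state`."""
    return simulate(state, goal, horizon)[1]


def run_agent(start: State, goal: State, max_steps: int = 10) -> list:
    """Act by replanning at every step and executing only the first action."""
    state, trajectory = start, [start]
    for _ in range(max_steps):
        if state == goal:
            break
        action = plan(state, goal)[0]
        # In this toy the real environment happens to match the world model.
        state = world_model(state, action)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    print(plan(2, 5))
    print(run_agent(0, 5))
```

The point of the sketch is the control flow: candidate actions are evaluated by rolling them forward in the model before anything is executed, in contrast to committing to the next action token by token. Replanning after every executed step is a design choice of this sketch, not something the summary attributes to SimuRA.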