Web World Models

📅 2025-12-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing world models for language agents face a dichotomy: fixed environments lack openness, while generative ones sacrifice logical consistency. This paper proposes a hybrid “code-as-skeleton, model-as-soul” paradigm that uses standard web technologies (HTML/CSS/JS, RESTful APIs, and TypeScript interfaces) as a verifiable, scalable infrastructure for world modeling. Large language models (LLMs) perform narrative generation, reasoning, and decision-making over deterministic, code-defined states and rules. The paper further introduces typed latent-state modeling and deterministic content generation to ensure structural integrity and strong controllability. The authors deploy diverse Web World Model (WWM) instances, including a geographic travel graph, a galactic exploration world, and an encyclopedic narrative world, and report concurrent improvements in logical reliability, engineering maintainability, and generative openness.

📝 Abstract
Language agents increasingly require persistent worlds in which they can act, remember, and learn. Existing approaches sit at two extremes: conventional web frameworks provide reliable but fixed contexts backed by databases, while fully generative world models aim for unlimited environments at the expense of controllability and practical engineering. In this work, we introduce the Web World Model (WWM), a middle ground where world state and "physics" are implemented in ordinary web code to ensure logical consistency, while large language models generate context, narratives, and high-level decisions on top of this structured latent state. We build a suite of WWMs on a realistic web stack, including an infinite travel atlas grounded in real geography, fictional galaxy explorers, web-scale encyclopedic and narrative worlds, and simulation- and game-like environments. Across these systems, we identify practical design principles for WWMs: separating code-defined rules from model-driven imagination, representing latent state as typed web interfaces, and utilizing deterministic generation to achieve unlimited but structured exploration. Our results suggest that web stacks themselves can serve as a scalable substrate for world models, enabling controllable yet open-ended environments. Project Page: https://github.com/Princeton-AI2-Lab/Web-World-Models.
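The separation between code-defined rules and model-driven narration described in the abstract can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the authors' implementation: the names WorldState, Action, step, Narrator, and stubNarrator are hypothetical, and the narrator is a plain function standing in for an LLM call so the sketch runs offline.

```typescript
// Hypothetical typed latent state for a tiny travel world (illustrative names only).
interface WorldState {
  location: string;
  fuel: number;
  visited: string[];
}

// Hypothetical action type; a real system would likely have several kinds.
type Action = { kind: "travel"; to: string; cost: number };

// Code-defined "physics": a pure transition that enforces a hard constraint
// and returns a new state instead of mutating the old one.
function step(state: WorldState, action: Action): WorldState {
  if (action.cost < 0 || action.cost > state.fuel) {
    throw new Error("illegal move: insufficient fuel");
  }
  return {
    location: action.to,
    fuel: state.fuel - action.cost,
    visited: [...state.visited, action.to],
  };
}

// Model-driven layer: any text generator that reads the state and returns prose.
// It has no way to change the rules or the state it is given.
type Narrator = (state: Readonly<WorldState>) => string;

// Stand-in for an LLM call (assumption: a real system would prompt a model here).
const stubNarrator: Narrator = (s) =>
  `You arrive in ${s.location} with ${s.fuel} units of fuel left.`;

const initialState: WorldState = { location: "Lisbon", fuel: 10, visited: ["Lisbon"] };
const nextState = step(initialState, { kind: "travel", to: "Porto", cost: 3 });
const sceneText = stubNarrator(nextState);
```

The design point is that the narrator only ever sees a state the rules have already validated, so a hallucinated description cannot put the world into an illegal configuration.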
Problem

Research questions and friction points this paper is trying to address.

Creating persistent, controllable worlds for language agents
Balancing structured web frameworks with generative world models
Implementing scalable, open-ended environments using web stacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Web code implements world state and physics
LLMs generate context and decisions on structured state
Deterministic generation enables unlimited structured exploration
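One common way to get unlimited but structured exploration from deterministic generation is to derive each location's content from a seed computed from its coordinates, so that revisiting a location reproduces the same content without storing it. The sketch below assumes that approach; the summary does not specify the paper's actual scheme, and Planet, BIOMES, hashSeed, makeRng, and planetAt are hypothetical names. The generator is the Park-Miller multiplicative congruential generator, chosen because its arithmetic stays within JavaScript's safe integer range.

```typescript
// Hypothetical typed record for one procedurally generated planet.
interface Planet {
  id: string;
  biome: string;
  moons: number;
}

const BIOMES = ["desert", "ocean", "ice", "jungle"];
const MODULUS = 2147483647; // 2^31 - 1, the Park-Miller modulus (a prime)

// Hash a string into the range [1, MODULUS - 1] so it is a valid generator seed.
function hashSeed(text: string): number {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) % MODULUS;
  }
  return (h % (MODULUS - 1)) + 1;
}

// Park-Miller generator: each call returns the next integer in [1, MODULUS - 1].
function makeRng(seed: number): () => number {
  let state = seed;
  return () => {
    state = (state * 48271) % MODULUS;
    return state;
  };
}

// The same (worldSeed, x, y) always yields the same planet, so the world can be
// unbounded without any per-location storage.
function planetAt(worldSeed: string, x: number, y: number): Planet {
  const id = `${worldSeed}:${x},${y}`;
  const rng = makeRng(hashSeed(id));
  return {
    id,
    biome: BIOMES[rng() % BIOMES.length],
    moons: rng() % 5,
  };
}
```

Calling planetAt("demo", 3, 4) twice returns structurally identical records. In a full system, the resulting typed record would be what an LLM is asked to describe, rather than something the LLM invents.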
Jichen Feng
Princeton University, University of Pennsylvania
Yifan Zhang
Princeton University
Chenggong Zhang
University of California, Los Angeles
Yifu Lu
Undergraduate, University of Michigan
Computer Science
Shilong Liu
RS@ByteDance, PhD@THU
Computer Vision, Object Detection, Visual Grounding, Multi-Modality, Multimodal Agent
Mengdi Wang
Princeton University