🤖 AI Summary
Symbolic world models (e.g., PDDL, executable simulators) suffer from a scarcity of large-scale, verifiable supervision data; existing approaches rely on static validation, failing to detect behavior-level errors arising during interactive execution.
Method: We propose a three-stage adaptive multi-agent closed-loop framework (Knowledge Synthesis → Model Development → Behavior-Aware Testing) that integrates interactive testing environments directly into the training pipeline to generate multi-turn, behavior-aligned supervision trajectories. The method combines tool-augmented multi-agent collaboration, web retrieval augmentation, executable simulation-based verification, unit-test-driven adaptive feedback, and supervised fine-tuning (SFT).
Contribution/Results: Our approach achieves state-of-the-art performance across three major benchmarks spanning both PDDL and code-based world modeling paradigms; fine-tuning on the generated trajectories yields an average relative improvement of 30.95% over the same model before training. It significantly enhances the behavioral correctness and generalization capability of symbolic world models.
📝 Abstract
Symbolic world models (e.g., PDDL domains or executable simulators) are central to model-based planning, but training LLMs to generate such world models is limited by the lack of large-scale verifiable supervision. Current approaches rely primarily on static validation methods that fail to catch behavior-level errors arising from interactive execution. In this paper, we propose Agent2World, a tool-augmented multi-agent framework that achieves strong inference-time world-model generation and also serves as a data engine for supervised fine-tuning by grounding generation in multi-agent feedback. Agent2World follows a three-stage pipeline: (i) a Deep Researcher agent performs knowledge synthesis via web search to fill specification gaps; (ii) a Model Developer agent implements executable world models; and (iii) a specialized Testing Team conducts adaptive unit testing and simulation-based validation. Agent2World demonstrates superior inference-time performance across three benchmarks spanning both Planning Domain Definition Language (PDDL) and executable code representations, achieving consistent state-of-the-art results. Beyond inference, the Testing Team serves as an interactive environment for the Model Developer, providing behavior-aware adaptive feedback that yields multi-turn training trajectories. A model fine-tuned on these trajectories substantially improves world-model generation, with an average relative gain of 30.95% over the same model before training. Project page: https://agent2world.github.io.