ScriptDoctor: Automatic Generation of PuzzleScript Games via Large Language Models and Tree Search

📅 2025-06-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Automating end-to-end procedural game design remains challenging, particularly for constrained domain-specific languages like PuzzleScript. Their stringent syntactic and semantic requirements have so far meant that generated games need human-in-the-loop validation and iteration.

Method: The authors propose a closed loop between an LLM and the game engine: (1) an LLM generates initial game logic from few-shot examples; (2) the compiler provides targeted feedback that is used to iteratively repair syntax and logic errors; and (3) a tree-search agent autonomously simulates gameplay and verifies solvability.

Contribution/Results: The summary describes this as the first fully automated, human-free pipeline integrating concept generation, code correction, and playability assessment for puzzle games. Evaluated on PuzzleScript, the approach produces numerous syntactically valid, logically solvable, and novel maze games. The results suggest that LLMs, when tightly coupled with domain-specific engines and search-based verification, can autonomously develop functional game designs under strong formal constraints, which is relevant to both constrained program synthesis and creative AI.
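Steps (1) and (2) of the method can be sketched as a small repair loop. This is a minimal illustration, not ScriptDoctor's actual code: `repair_loop`, `fake_generate`, and `fake_compile` are hypothetical names, the prompt format is invented, and the real system would call an LLM and the PuzzleScript compiler where the toy stand-ins appear.

```python
from typing import Callable, List, Optional, Tuple


def repair_loop(
    generate: Callable[[str], str],
    compile_game: Callable[[str], List[str]],
    prompt: str,
    max_attempts: int = 5,
) -> Tuple[Optional[str], int]:
    """Generate code, then feed compiler errors back until it compiles.

    Returns (code, attempts_used), or (None, max_attempts) on failure.
    Both callables are assumptions standing in for an LLM and a compiler.
    """
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        code = generate(current_prompt)
        errors = compile_game(code)
        if not errors:
            return code, attempt
        # Rebuild the prompt from the original request plus the latest
        # failed attempt and its error messages.
        current_prompt = (
            prompt
            + "\n\nPrevious attempt:\n" + code
            + "\n\nCompiler errors:\n" + "\n".join(errors)
        )
    return None, max_attempts


# Toy stand-ins (hypothetical): the "model" only succeeds once it has
# seen compiler errors, and the "compiler" rejects the word "broken".
def fake_generate(prompt: str) -> str:
    return "RULES ok" if "Compiler errors" in prompt else "RULES broken"


def fake_compile(code: str) -> List[str]:
    return ["line 1: unknown object 'broken'"] if "broken" in code else []


result = repair_loop(fake_generate, fake_compile, "Design a puzzle game.")
print(result)
```

Capping the number of attempts keeps the loop from running indefinitely when the generator never produces compilable code; the cap of 5 is an arbitrary choice for this sketch.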

📝 Abstract

There is much interest in using large pre-trained models in Automatic Game Design (AGD), whether via the generation of code, assets, or more abstract conceptualization of design ideas. But so far this interest largely stems from the ad hoc use of such generative models under persistent human supervision. Much work remains to show how these tools can be integrated into longer-time-horizon AGD pipelines, in which systems interface with game engines to test generated content autonomously. To this end, we introduce ScriptDoctor, a Large Language Model (LLM)-driven system for automatically generating and testing games in PuzzleScript, an expressive but highly constrained description language for turn-based puzzle games over 2D gridworlds. ScriptDoctor generates and tests game design ideas in an iterative loop, where human-authored examples are used to ground the system's output, compilation errors from the PuzzleScript engine are used to elicit functional code, and search-based agents play-test generated games. ScriptDoctor serves as a concrete example of the potential of automated, open-ended LLM-based workflows in generating novel game content.

Problem

Research questions and friction points this paper is trying to address.

Automating game design using large language models

Integrating generative models into autonomous game creation pipelines

Generating and testing PuzzleScript games without human supervision

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven system for PuzzleScript game generation

Iterative loop grounded in human-authored examples

Search-based agents for autonomous play-testing
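To make the play-testing idea concrete, here is a minimal sketch of a search-based solvability check. It does not model PuzzleScript rules or the paper's agent: it assumes a hypothetical crate-pushing gridworld (`#` wall, `P` player, `*` crate, `O` target) and uses plain breadth-first search as a simple stand-in for whatever search the system actually runs.

```python
from collections import deque
from typing import List, Optional

MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def solve(level: List[str], max_states: int = 100_000) -> Optional[str]:
    """Return a shortest move string that puts every crate on a target,
    or None if no solution is found within the state budget."""
    floor, targets, crates = set(), set(), set()
    player = None
    for r, row in enumerate(level):
        for c, ch in enumerate(row):
            if ch == "#":
                continue  # walls are simply absent from `floor`
            floor.add((r, c))
            if ch == "P":
                player = (r, c)
            elif ch == "*":
                crates.add((r, c))
            elif ch == "O":
                targets.add((r, c))
    if player is None:
        raise ValueError("level has no player")

    start = (player, frozenset(crates))
    queue = deque([(start, "")])
    seen = {start}
    while queue and len(seen) <= max_states:
        (pos, boxes), path = queue.popleft()
        if boxes == targets:  # win: crate positions match target positions
            return path
        for name, (dr, dc) in MOVES.items():
            nxt = (pos[0] + dr, pos[1] + dc)
            if nxt not in floor:
                continue
            new_boxes = boxes
            if nxt in boxes:  # stepping into a crate pushes it one cell
                beyond = (nxt[0] + dr, nxt[1] + dc)
                if beyond not in floor or beyond in boxes:
                    continue
                new_boxes = (boxes - {nxt}) | {beyond}
            state = (nxt, new_boxes)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + name))
    return None


solvable = solve(["#####", "#P*O#", "#####"])
unsolvable = solve(["#####", "#PO*#", "#####"])
print(solvable, unsolvable)
```

A returned move string shows the level can be won, while `None` means either that the level is unsolvable or that the state budget ran out. A pipeline could use that signal to discard or revise a generated game.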
