🤖 AI Summary
Current large language models struggle to support iterative creative improvement in game generation: single-shot outputs are brittle at runtime, experience accumulates poorly across versions, creativity evaluation is subjective, and game mechanics are not modeled explicitly. This work proposes CreativeGame, a mechanic-driven multi-agent system that treats game mechanics as core units amenable to planning, tracking, and evaluation. By integrating a mechanic-guided planning loop, a proxy reward built on programmatic signals, lineage-scoped cross-version memory, and runtime validation, the system establishes an interpretable, evolutionary generation pipeline. The current system contains 71 stored lineages, 88 saved nodes, and a 774-entry global mechanic archive, which supports architectural analysis and real lineage-level case studies that show traceable, mechanic-level innovation.
📝 Abstract
Large language models can generate plausible game code, but turning this capability into iterative creative improvement remains difficult. In practice, single-shot generation often produces brittle runtime behavior, weak accumulation of experience across versions, and creativity scores that are too subjective to serve as reliable optimization signals. A further limitation is that mechanics are frequently treated only as post-hoc descriptions, rather than as explicit objects that can be planned, tracked, preserved, and evaluated during generation.
This report presents CreativeGame, a multi-agent system for iterative HTML5 game generation that addresses these issues through four coupled ideas: a proxy reward centered on programmatic signals rather than pure LLM judgment; lineage-scoped memory for cross-version experience accumulation; runtime validation integrated into both repair and reward; and a mechanic-guided planning loop in which retrieved mechanic knowledge is converted into an explicit mechanic plan before code generation begins. The goal is not merely to produce a playable artifact in one step, but to support interpretable version-to-version evolution.
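As a rough illustration of how a proxy reward might combine programmatic signals with an explicit mechanic plan, the following minimal sketch may help. It is not the system's actual reward: the names `RuntimeReport` and `proxy_reward`, the individual signals, and the 0.5/0.3/0.2 weights are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeReport:
    """Hypothetical result of running a generated game in a headless browser."""
    loaded: bool             # page loaded without a fatal error
    console_errors: int      # number of runtime errors observed
    responds_to_input: bool  # game state changed after simulated input


def proxy_reward(report: RuntimeReport,
                 planned: set[str],
                 implemented: set[str],
                 parent_mechanics: set[str]) -> float:
    """Combine programmatic signals into a score in [0, 1].

    The weights below are illustrative assumptions, not values from the report.
    """
    # Runtime validation gates the reward: a game that fails to load scores zero.
    if not report.loaded:
        return 0.0

    # Runtime health decays with the number of observed errors.
    runtime = 1.0 / (1 + report.console_errors)
    if not report.responds_to_input:
        runtime *= 0.5

    # Fraction of the explicit mechanic plan that made it into the code.
    plan_coverage = len(planned & implemented) / len(planned) if planned else 0.0

    # Fraction of implemented mechanics that are new relative to the parent version.
    novel = implemented - parent_mechanics
    novelty = len(novel) / len(implemented) if implemented else 0.0

    return 0.5 * runtime + 0.3 * plan_coverage + 0.2 * novelty
```

In this sketch no LLM judgment enters the score. Every term is computed from the runtime report or from set operations on mechanic names, which is one way to read "programmatic signals rather than pure LLM judgment".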
The current system contains 71 stored lineages, 88 saved nodes, and a 774-entry global mechanic archive, implemented in 6,181 lines of Python together with inspection and visualization tooling. The system is therefore substantial enough to support architectural analysis, reward inspection, and real lineage-level case studies rather than only prompt-level demos.
A real 4-generation lineage shows that mechanic-level innovation can emerge in later versions and can be inspected directly through version-to-version records. The central contribution is therefore not only game generation, but a concrete pipeline for observing progressive evolution through explicit mechanic change.
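To make "version-to-version records" concrete, here is a minimal sketch of how mechanic changes along one lineage could be computed. The `Node` structure, the `mechanic_changes` helper, and the mechanic names in the usage example are hypothetical and are not taken from the system or from its real lineage data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical saved node: one game version within a lineage."""
    version: int
    mechanics: frozenset[str]


def mechanic_changes(lineage: list[Node]) -> list[dict]:
    """Return one record per consecutive pair of versions.

    Each record lists which mechanics were added, removed, or preserved
    between the earlier and the later version.
    """
    ordered = sorted(lineage, key=lambda node: node.version)
    records = []
    for prev, curr in zip(ordered, ordered[1:]):
        records.append({
            "from": prev.version,
            "to": curr.version,
            "added": sorted(curr.mechanics - prev.mechanics),
            "removed": sorted(prev.mechanics - curr.mechanics),
            "preserved": sorted(prev.mechanics & curr.mechanics),
        })
    return records


# Made-up 4-generation lineage, for illustration only.
example_lineage = [
    Node(1, frozenset({"jump"})),
    Node(2, frozenset({"jump", "dash"})),
    Node(3, frozenset({"jump", "dash"})),
    Node(4, frozenset({"dash", "gravity_flip"})),
]

for record in mechanic_changes(example_lineage):
    print(record)
```

With records of this shape, a later-generation innovation shows up as a non-empty `added` list on a late transition, which can be inspected without reading the generated code.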