OpenGame: Open Agentic Coding for Games

📅 2026-04-20
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Existing large language models and code agents struggle to generate complete, playable web games from high-level design specifications, often failing due to cross-file inconsistencies, erroneous scene transitions, and logical incoherence. This work proposes an end-to-end game development agent framework featuring a novel Game Skill mechanism, an evolvable template library, and a validation-and-repair protocol. It also introduces OpenGame-Bench, the first interactive benchmark for evaluating generated games. The framework is built on GameCoder-27B, a specialized code large language model trained through continual pretraining, supervised fine-tuning, and execution-driven reinforcement learning; automatic evaluation is performed with headless browsers and vision-language models. Evaluated on 150 diverse tasks, the approach achieves a new state of the art, significantly outperforming existing methods in architectural soundness, visual usability, and intent alignment.

📝 Abstract
Game development sits at the intersection of creative design and intricate software engineering, demanding the joint orchestration of game engines, real-time loops, and tightly coupled state across many files. While Large Language Models (LLMs) and code agents now solve isolated programming tasks with ease, they consistently stumble when asked to produce a fully playable game from a high-level design, collapsing under cross-file inconsistencies, broken scene wiring, and logical incoherence. We bridge this gap with OpenGame, the first open-source agentic framework explicitly designed for end-to-end web game creation. At its core lies Game Skill, a reusable, evolving capability composed of a Template Skill that grows a library of project skeletons from experience and a Debug Skill that maintains a living protocol of verified fixes. Together, these enable the agent to scaffold stable architectures and systematically repair integration errors rather than patch isolated syntax bugs. Powering this framework is GameCoder-27B, a code LLM specialized for game engine mastery through a three-stage pipeline of continual pre-training, supervised fine-tuning, and execution-grounded reinforcement learning. Since verifying interactive playability is fundamentally harder than checking static code, we further introduce OpenGame-Bench, an evaluation pipeline that scores agentic game generation along Build Health, Visual Usability, and Intent Alignment via headless browser execution and VLM judging. Across 150 diverse game prompts, OpenGame establishes a new state of the art. We hope OpenGame pushes code agents beyond discrete software engineering problems and toward building complex, interactive real-world applications. Our framework will be fully open-sourced.
Problem

Research questions and friction points this paper is trying to address.

game development
code agents
Large Language Models
cross-file consistency
interactive playability
Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic coding
game development
code LLM
debug skill
evaluation benchmark