🤖 AI Summary
Existing generative game models produce high-fidelity visuals and respond to player inputs but suffer from pervasive numerical inconsistency (e.g., score logic errors) and spatial inconsistency (e.g., abrupt scene discontinuities). To address this, we propose “Model as a Game” (MaaG), a novel paradigm that formally defines and jointly models both consistency types. Our approach introduces a LogicNet-driven numerical reasoning module and a map-aware spatial continuity framework. Technically, we extend the DiT architecture with an external logic computation interface, dynamic map maintenance, and position-aware retrieval. Evaluated on three generative games, MaaG significantly improves both numerical and spatial consistency metrics (measured via logic correctness and positional coherence) while incurring only marginal inference overhead. This work marks the first demonstration of unified high-quality generation and reliable, rule-governed gameplay mechanics in end-to-end generative game models.
📝 Abstract
Recent advances in generative models have significantly impacted game generation. However, despite producing high-quality graphics and responding adequately to player input, existing models often fail to maintain fundamental game properties such as numerical and spatial consistency. Numerical consistency ensures that gameplay mechanics correctly reflect score changes and other quantitative elements, while spatial consistency prevents jarring scene transitions, providing a seamless player experience. In this paper, we revisit the paradigm of generative games to explore what truly constitutes a Model as a Game (MaaG) with a well-developed mechanism. We begin with an empirical study on “Traveler”, a 2D game created by an LLM whose minimalist rules nonetheless make it difficult for generative models to maintain consistency. Based on the DiT architecture, we design two specialized modules: (1) a numerical module that integrates a LogicNet to determine event triggers, with calculations processed externally as conditions for image generation; and (2) a spatial module that maintains a map of explored areas, retrieving location-specific information during generation and linking new observations to ensure continuity. Experiments across three games demonstrate that our integrated modules significantly enhance performance on consistency metrics compared to baselines, while incurring minimal time overhead during inference.
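The two modules can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify any interfaces, so `NumericalState`, `ExploredMap`, `build_condition`, the boolean `triggered` flag (standing in for the LogicNet's event decision), and the string observations are all hypothetical names chosen here. The sketch only shows the idea that arithmetic and map memory live outside the generator and are passed to it as conditions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Position = Tuple[int, int]


@dataclass
class NumericalState:
    """Score kept outside the generator so that arithmetic stays exact."""
    score: int = 0

    def apply_event(self, triggered: bool, delta: int = 1) -> int:
        # `triggered` is a stand-in for the LogicNet's event decision
        # (hypothetical interface; the real module is a learned network).
        if triggered:
            self.score += delta
        return self.score


@dataclass
class ExploredMap:
    """Memory of explored locations, keyed by grid position (assumed layout)."""
    tiles: Dict[Position, str] = field(default_factory=dict)

    def retrieve(self, pos: Position) -> Optional[str]:
        # Return what was previously observed at `pos`, if anything.
        return self.tiles.get(pos)

    def link(self, pos: Position, observation: str) -> None:
        # Keep the first observation so that revisits stay consistent.
        self.tiles.setdefault(pos, observation)


def build_condition(state: NumericalState, world: ExploredMap, pos: Position,
                    triggered: bool, new_observation: str) -> dict:
    """Assemble the conditioning signal handed to the image generator."""
    score = state.apply_event(triggered)
    remembered = world.retrieve(pos)
    if remembered is None:
        # First visit: store the new observation in the map.
        world.link(pos, new_observation)
        remembered = new_observation
    return {"score": score, "position": pos, "map_context": remembered}


state, world = NumericalState(), ExploredMap()
first = build_condition(state, world, (0, 0), True, "forest")
revisit = build_condition(state, world, (0, 0), False, "desert")
```

In this sketch the revisit to `(0, 0)` reuses the stored `"forest"` context rather than the conflicting new observation, and the score changes only when an event is triggered, which mirrors the spatial and numerical consistency properties the abstract describes.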