🤖 AI Summary
This work addresses the challenge of maintaining visual consistency across multiple rarity tiers in game UI design, a process that currently relies on manual effort. The authors propose the first LLM-driven visual generation agent framework tailored for game production, which translates natural language descriptions into editable Figma designs through a six-stage neuro-symbolic pipeline. The intermediate representation is a structured Design Spec JSON, and a vision-language model (VLM)-based Reflection Controller enables iterative self-correction. Cross-model analysis yields a taxonomy of failure modes in game UI generation (rarity-dependent degradation and visual emptiness), and the study introduces two findings, the “quality ceiling effect” and the “rendering-evaluation fidelity principle.” Experiments across 110 test cases, three LLMs, and three UI templates indicate that gains from the Reflection Controller are bounded by the headroom below a quality threshold, and that partial rendering enhancements can lower VLM evaluation scores by amplifying structural flaws.
📝 Abstract
Game UI design requires consistent visual assets across rarity tiers yet remains a predominantly manual process. We present GameUIAgent, an LLM-powered agentic framework that translates natural language descriptions into editable Figma designs via a Design Spec JSON intermediate representation. A six-stage neuro-symbolic pipeline combines LLM generation, deterministic post-processing, and a Vision-Language Model (VLM)-guided Reflection Controller (RC) for iterative self-correction with guaranteed non-regressive quality. In an evaluation across 110 test cases, three LLMs, and three UI templates, cross-model analysis establishes a game-domain failure taxonomy (rarity-dependent degradation and visual emptiness) and uncovers two key empirical findings. First, a Quality Ceiling Effect (Pearson r = -0.96, p < 0.01) suggests that RC improvement is bounded by the headroom below a quality threshold, a visual-domain counterpart to test-time compute scaling laws. Second, a Rendering-Evaluation Fidelity Principle reveals that partial rendering enhancements paradoxically degrade VLM evaluation by amplifying structural defects. Together, these results establish foundational principles for LLM-driven visual generation agents in game production.
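The abstract does not give the Reflection Controller's implementation, so the following is only a minimal sketch of one way an iterative self-correction loop could guarantee non-regressive quality: a revised spec is kept only if it scores strictly higher than the best one so far. The names `reflect`, `score_fn`, `revise_fn`, the 0-100 score scale, and the `threshold` and `max_rounds` defaults are all hypothetical, chosen for illustration and not taken from the paper.

```python
from typing import Callable, Tuple

# Hypothetical stand-in for a Design Spec JSON object.
Spec = dict


def reflect(
    spec: Spec,
    score_fn: Callable[[Spec], int],   # assumed: VLM-style evaluator, 0-100
    revise_fn: Callable[[Spec], Spec],  # assumed: LLM-style revision step
    threshold: int = 80,                # assumed quality threshold
    max_rounds: int = 3,                # assumed iteration budget
) -> Tuple[Spec, int]:
    """Return the best spec seen and its score; never worse than the input."""
    best_spec, best_score = spec, score_fn(spec)
    for _ in range(max_rounds):
        # Stop once the threshold is reached: no headroom left to exploit.
        if best_score >= threshold:
            break
        candidate = revise_fn(best_spec)
        candidate_score = score_fn(candidate)
        # Accept only strict improvements, so quality cannot regress.
        if candidate_score > best_score:
            best_spec, best_score = candidate, candidate_score
    return best_spec, best_score
```

Under these assumptions, a spec that already scores at or above the threshold is returned unchanged, which is one simple way to picture why improvement would be bounded by the headroom below that threshold.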