🤖 AI Summary
Modern games undergo frequent updates, leading to low testing efficiency and poor precision in impact analysis. To address this, we propose KLPEG—a novel framework that deeply integrates knowledge graphs (KGs) with large language models (LLMs) for automated game testing. KLPEG constructs a structured KG encoding game elements and their causal relationships, enabling cross-version knowledge accumulation and reuse; it further leverages LLMs to parse update logs and perform multi-hop reasoning for precise impact localization and generation of customized test cases. Experiments on Overcooked and Minecraft demonstrate that KLPEG significantly improves update impact identification accuracy and test coverage while reducing redundant execution steps—outperforming baseline methods in both testing efficiency and effectiveness. Our core contributions are: (1) a KG-augmented LLM reasoning mechanism that enhances semantic understanding and logical inference, and (2) a structured knowledge management paradigm specifically designed for incremental game updates.
📝 Abstract
The rapid iteration and frequent updates of modern video games pose significant challenges to the efficiency and specificity of testing. Although automated playtesting methods based on Large Language Models (LLMs) have shown promise, they often lack structured knowledge accumulation mechanisms, making it difficult to conduct precise and efficient testing tailored for incremental game updates. To address this challenge, this paper proposes a KLPEG framework. The framework constructs and maintains a Knowledge Graph (KG) to systematically model game elements, task dependencies, and causal relationships, enabling knowledge accumulation and reuse across versions. Building on this foundation, the framework utilizes LLMs to parse natural language update logs, identify the scope of impact through multi-hop reasoning on the KG, enabling the generation of update-tailored test cases. Experiments in two representative game environments, Overcooked and Minecraft, demonstrate that KLPEG can more accurately locate functionalities affected by updates and complete tests in fewer steps, significantly improving both playtesting effectiveness and efficiency.