🤖 AI Summary
To address the failure of large language models (LLMs) at multi-level decision-making in TextStarCraft II, which stems from gaps in domain knowledge and inadequate prioritization of subtasks, this paper proposes the Hierarchical Expert Prompt (HEP) framework. HEP injects expert-level tactical knowledge in a structured way and introduces an importance-aware task decomposition mechanism coupled with dynamic weighted scheduling, so that the model can handle complex real-time strategic decisions. For the first time, an LLM equipped with HEP defeats the highest-difficulty (Elite) built-in AI in TextStarCraft II, and it consistently outperforms the baseline method at the other difficulty levels. The source code, prompt templates, and battle replays are publicly available, providing a reproducible and extensible basis for LLM-driven complex game AI.
📝 Abstract
Since their emergence, Large Language Models (LLMs) have been widely used in fields such as writing, translation, and search. However, LLM-based methods still have considerable room to improve on complex tasks such as decision-making in the StarCraft II environment. To address problems such as the lack of relevant knowledge and poor control over subtasks of varying importance, we propose a Hierarchical Expert Prompt (HEP) for LLMs. Our method improves the understanding of game situations through expert-level tactical knowledge and improves the handling of tasks of varying importance through a hierarchical framework. Our approach defeated the highest-level (Elite) standard built-in agent in TextStarCraft II for the first time and consistently outperformed the baseline method at the other difficulty levels. Our experiments suggest that the proposed method is a practical solution for tackling complex decision-making challenges. The replay videos can be viewed at https://www.bilibili.com/video/BV1uz42187EF and https://youtu.be/dO3PshWLV5M, and our code has been open-sourced at https://github.com/luchang1113/HEP-LLM-play-StarCraftII.