Reflection of Episodes: Learning to Play Game from Expert and Self Experiences

📅 2025-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of enabling large language models (LLMs) to autonomously evolve strategic policies through self-reflection in complex, dynamic real-time strategy (RTS) environments, this paper proposes a reflection-based learning framework integrating expert and self-acquired experience for StarCraft II. Methodologically, it introduces the first episode-level reflection learning paradigm, comprising key-frame-driven experience extraction, dual-source (expert + self) experience fusion for decision-making, and an LLM-driven posterior reflection mechanism—evaluated in the TextStarCraft II simulation environment. Experiments demonstrate stable victory over the built-in AI under the “Very Hard” difficulty setting; process analysis confirms autonomous, iterative policy refinement. The core contributions are: (1) establishing a novel episode-level reflection learning paradigm for LLMs in RTS domains, and (2) designing a scalable, multi-source experience co-evolution mechanism that synergistically integrates heterogeneous knowledge sources.

📝 Abstract
StarCraft II is a complex and dynamic real-time strategy (RTS) game environment, which is well suited to artificial intelligence and reinforcement learning research. To address the problem of Large Language Model (LLM) learning in complex environments through self-reflection, we propose a Reflection of Episodes (ROE) framework based on expert experience and self-experience. This framework first obtains key information in the game through a keyframe selection method, then makes decisions based on expert experience and self-experience. After a game is completed, it reflects on the previous experience to obtain new self-experience. Finally, in our experiments, the method beat the built-in AI at the Very Hard difficulty in TextStarCraft II. We analyze the LLM's in-game data in detail and verify the method's effectiveness.
Problem

Research questions and friction points this paper is trying to address.

LLM learning in complex environments
Self-reflection in RTS games
Combining expert and self-experience for decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reflection of Episodes framework
Keyframe selection method
Expert and self-experience integration
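The episode-level loop described in the abstract (keyframe selection, dual-source decision-making, post-game reflection) can be sketched as follows. This is a minimal toy illustration, not the paper's actual implementation: the environment, the keyframe rule, and the `decide`/`reflect` stand-ins for LLM calls are all hypothetical.

```python
# Hypothetical sketch of the Reflection of Episodes (ROE) loop. A real system
# would prompt an LLM inside decide() and reflect(); here simple functions
# stand in so the control flow is runnable end to end.

class ToyEnv:
    """Toy stand-in for TextStarCraft II: an episode ends after four steps."""
    def reset(self):
        self.t = 0
        return {"step": 0, "important": True}

    def step(self, action):
        self.t += 1
        obs = {"step": self.t, "important": self.t % 2 == 0}
        return obs, self.t >= 4  # (observation, done)

def select_keyframes(trajectory):
    # Keyframe selection: keep only the game states flagged as important.
    return [obs for obs, _ in trajectory if obs["important"]]

def decide(keyframes, expert_exp, self_exp):
    # Dual-source decision: fuse expert experience with self-experience.
    return f"act|{len(keyframes)}kf|{len(expert_exp)}expert|{len(self_exp)}self"

def reflect(trajectory):
    # Posterior reflection: distill the finished episode into new experience.
    return f"lesson from {len(trajectory)}-step episode"

def run_roe(env, expert_experience, num_episodes=2):
    self_experience = []
    for _ in range(num_episodes):
        obs, done, trajectory = env.reset(), False, []
        while not done:
            trajectory.append((obs, None))
            action = decide(select_keyframes(trajectory),
                            expert_experience, self_experience)
            trajectory[-1] = (obs, action)
            obs, done = env.step(action)
        # After each game, reflection grows the self-experience pool.
        self_experience.append(reflect(trajectory))
    return self_experience

lessons = run_roe(ToyEnv(), expert_experience=["build workers early"])
print(lessons)
```

Note how self-experience accumulates across episodes while expert experience stays fixed, matching the paper's co-evolution of the two knowledge sources.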
Xiaojie Xu
College of Artificial Intelligence, Nankai University, Tianjin, China
Zongyuan Li
College of Artificial Intelligence, Nankai University, Tianjin, China
Chang Lu
College of Artificial Intelligence, Nankai University, Tianjin, China
Runnan Qi
Laboratory for Big Data and Decision, National University of Defense Technology, Changsha, China
Yanan Ni
Laboratory for Big Data and Decision, National University of Defense Technology, Changsha, China
Lumin Jiang
Laboratory for Big Data and Decision, National University of Defense Technology, Changsha, China
Xiangbei Liu
College of Artificial Intelligence, Nankai University, Tianjin, China
Xuebo Zhang
Ph.D., Professor, Institute of Robotics, Nankai University, China
Visual servoing · mobile robotics · motion planning · SLAM · game AI
Yongchun Fang
Nankai University
Visual Servoing · Nonlinear Control · Atomic Force Microscope
Kuihua Huang
Laboratory for Big Data and Decision, National University of Defense Technology, Changsha, China
Xian Guo
College of Artificial Intelligence, Nankai University, Tianjin, China
Zhanghua Wu
Jiangsu Automation Research Institute, Jiangsu, China
Zhenya Li
Laboratory for Big Data and Decision, National University of Defense Technology, Changsha, China