Self-Evolving Multi-Agent Framework for Efficient Decision Making in Real-Time Strategy Scenarios

📅 2026-03-24

📈 Citations: 0

✨ Influential: 0

career value

214K/year

🤖 AI Summary

This work addresses the challenge of balancing decision speed and quality in real-time strategy (RTS) games, where large language models often suffer from high inference latency and inconsistent reasoning. To this end, the authors propose SEMA, a novel framework that integrates structure entropy–based dynamic observation pruning, a hybrid knowledge memory mechanism combining micro-level trajectories with macro-level experience, and a cross-episode self-evolutionary calibration strategy. By leveraging multi-agent collaboration and hierarchical domain knowledge distillation, SEMA enhances both decision efficiency and logical consistency. Experimental results on multiple StarCraft II maps demonstrate that SEMA significantly outperforms baseline methods, achieving over 50% reduction in average decision latency while simultaneously improving win rates, thereby validating its effectiveness and robustness in complex RTS environments.

Technology Category

Application Category

📝 Abstract

Large language models (LLMs) have demonstrated exceptional potential in complex reasoning,pioneering a new paradigm for autonomous agent decision making in dynamic settings. However, in Real-Time Strategy (RTS) scenarios, LLMs suffer from a critical speed-quality trade-off. Specifically expansive state spaces and time limits render inference delays prohibitive, while stochastic planning errors undermine logical consistency. To address these challenges, we present SEMA (Self-Evolving Multi-Agent), a novel framework designed for high-performance, low-latency decision-making in RTS environments. This collaborative multi-agent framework facilitates self-evolution by adaptively calibrating model bias through in-episode assessment and cross-episode analysis. We further incorporate dynamic observation pruning based on structural entropy to model game states topologically. By distilling high dimensional data into core semantic information, this approach significantly reduces inference time. We also develop a hybrid knowledge-memory mechanism that integrates micro-trajectories, macro-experience, and hierarchical domain knowledge, thereby enhancing both strategic adaptability and decision consistency. Experiments across multiple StarCraft II maps demonstrate that SEMA achieves superior win rates while reducing average decision latency by over 50%, validating its efficiency and robustness in complex RTS scenarios.

Problem

Research questions and friction points this paper is trying to address.

Real-Time Strategy

large language models

decision-making

inference latency

state space

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Evolving Multi-Agent

Dynamic Observation Pruning

Structural Entropy