🤖 AI Summary
This work addresses the two-stage Colonel Blotto game, a canonical networked adversarial resource allocation problem, by jointly modeling the temporal dependency between initial deployment and multi-round dynamic reallocation together with the constraints imposed by the graph topology. We propose the first hierarchical graph Transformer architecture, which integrates a structural-bias encoder with a dual-agent hierarchical decision-making model, and we design an inter-layer feedback reinforcement learning algorithm that explicitly captures coordination between the two policy levels. Compared to conventional hierarchical decision frameworks and graph neural networks, our approach significantly improves resource allocation efficiency and adversarial payoff in complex dynamic game settings. Extensive experiments on multiple synthetic and real-world network topologies demonstrate its strong capability to approximate optimal strategies and its robust generalization across diverse graph structures.
📝 Abstract
The two-stage Colonel Blotto game is a typical adversarial resource allocation problem in which two opposing agents allocate resources over a network topology in two phases: an initial resource deployment, followed by multiple rounds of dynamic reallocation. The sequential dependency between the game stages and the complex constraints imposed by the graph topology make it difficult for traditional approaches to attain a globally optimal strategy. To address these challenges, we propose a hierarchical graph Transformer framework called HGformer. By combining an enhanced graph Transformer encoder that incorporates structural biases with a two-agent hierarchical decision model, our approach enables efficient policy generation in large-scale adversarial environments. Moreover, we design a layer-by-layer feedback reinforcement learning algorithm that feeds the long-term returns of lower-level decisions back into the optimization of the higher-level strategy, thus bridging the coordination gap between the two decision-making stages. Experimental results demonstrate that, compared to existing hierarchical decision-making and graph neural network methods, HGformer significantly improves resource allocation efficiency and adversarial payoff, achieving superior overall performance in complex dynamic game scenarios.
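To make the two-phase setting concrete, the following is a minimal sketch of a Colonel Blotto payoff on a graph, followed by one reallocation step. It is not HGformer and not the paper's environment: the abstract does not state the exact payoff or movement rules, so the tie-splitting rule, the restriction that resources move only along edges, and the names `blotto_payoff` and `reallocate` are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Allocation = Dict[int, int]      # node id -> resources placed on that node
EdgeSet = Set[Tuple[int, int]]   # undirected edges, stored in either direction


def blotto_payoff(alloc_a: Allocation, alloc_b: Allocation,
                  values: Dict[int, float]) -> Tuple[float, float]:
    """Score both agents: a node's value goes to whoever holds more resources.

    Assumption: ties split the node value equally (a common convention,
    not confirmed by the abstract).
    """
    payoff_a = payoff_b = 0.0
    for node, value in values.items():
        a = alloc_a.get(node, 0)
        b = alloc_b.get(node, 0)
        if a > b:
            payoff_a += value
        elif b > a:
            payoff_b += value
        else:
            payoff_a += value / 2
            payoff_b += value / 2
    return payoff_a, payoff_b


def reallocate(alloc: Allocation, edges: EdgeSet,
               src: int, dst: int, amount: int) -> Allocation:
    """Return a new allocation with `amount` resources moved from src to dst.

    Assumption: resources may only move between adjacent nodes, which is
    one simple way a graph topology can constrain reallocation.
    """
    if (src, dst) not in edges and (dst, src) not in edges:
        raise ValueError(f"nodes {src} and {dst} are not adjacent")
    if amount < 0 or alloc.get(src, 0) < amount:
        raise ValueError("invalid amount for this move")
    new_alloc = dict(alloc)  # leave the caller's allocation untouched
    new_alloc[src] = new_alloc.get(src, 0) - amount
    new_alloc[dst] = new_alloc.get(dst, 0) + amount
    return new_alloc


if __name__ == "__main__":
    # Hypothetical path graph 0 - 1 - 2 with the middle node worth more.
    edges = {(0, 1), (1, 2)}
    values = {0: 1.0, 1: 2.0, 2: 1.0}

    # Phase 1: initial deployment by both agents (5 resources each).
    a = {0: 3, 1: 1, 2: 1}
    b = {0: 1, 1: 2, 2: 2}
    print(blotto_payoff(a, b, values))   # (1.0, 3.0)

    # Phase 2: agent A shifts two resources toward the contested middle node.
    a = reallocate(a, edges, src=0, dst=1, amount=2)
    print(blotto_payoff(a, b, values))   # (2.5, 1.5)
```

In this toy example a single reallocation flips the outcome in agent A's favor, which illustrates why the initial deployment and the later adjustments cannot be optimized independently: the value of a first-phase allocation depends on which moves remain reachable in the second phase.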