🤖 AI Summary
Existing hierarchical reinforcement learning (HRL) approaches are largely restricted to two-level architectures or rely on centralized training, hindering scalability and decentralized deployment. This paper proposes a fully decentralized multi-agent HRL framework supporting hierarchies of arbitrary depth. Its core innovation is LevelEnv, an environment abstraction that treats each hierarchy level as the environment for the agents above it, standardizing inter-layer interaction while keeping levels loosely coupled and allowing heterogeneous agents to be deployed together. By decoupling hierarchical dependencies, LevelEnv supports distributed and asynchronous training. Evaluated on standard multi-agent benchmarks, including SMAC and MPE, the method improves both sample efficiency and final policy performance. The empirical results indicate that decentralized hierarchical structuring delivers tangible gains in adaptability and scalability, validating its effectiveness for large-scale cooperative multi-agent settings.
📝 Abstract
Hierarchical organization is fundamental to biological systems and human societies, yet artificial intelligence systems often rely on monolithic architectures that limit adaptability and scalability. Current hierarchical reinforcement learning (HRL) approaches typically restrict hierarchies to two levels or require centralized training, which limits their practical applicability. We introduce TAME Agent Framework (TAG), a framework for constructing fully decentralized hierarchical multi-agent systems. TAG enables hierarchies of arbitrary depth through a novel LevelEnv concept, which abstracts each hierarchy level as the environment for the agents above it. This approach standardizes information flow between levels while preserving loose coupling, allowing for seamless integration of diverse agent types. We demonstrate the effectiveness of TAG by implementing hierarchical architectures that combine different RL agents across multiple levels, achieving improved performance over classical multi-agent RL baselines on standard benchmarks. Our results show that decentralized hierarchical organization enhances both learning speed and final performance, positioning TAG as a promising direction for scalable multi-agent systems.
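To make the LevelEnv idea concrete, here is a minimal sketch of how a level can expose a standard environment interface (reset/step) to the agents above it, so that levels stack to arbitrary depth. All class names, the goal-passing scheme, and the toy dynamics below are illustrative assumptions, not the TAG implementation.

```python
# Illustrative sketch of the LevelEnv abstraction (hypothetical names,
# not from the TAG codebase). Each LevelEnv wraps a lower environment
# plus the agents at that level, and itself looks like an environment
# to the level above: the higher level's "action" is a goal passed down.

class BaseEnv:
    """Bottom level: a toy environment with a scalar state."""
    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state += action
        return self.state, -abs(self.state), False  # obs, reward, done


class Agent:
    """A simple agent that turns a goal from above into a local action."""
    def act(self, observation, goal):
        return 1 if observation < goal else -1


class LevelEnv:
    """Wraps one level (env + its agents) as an environment for the
    level above, standardizing the interface between levels."""
    def __init__(self, env, agents):
        self.env, self.agents = env, agents

    def reset(self):
        self.obs = self.env.reset()
        return self.obs

    def step(self, goal):
        total_reward, done = 0.0, False
        for agent in self.agents:  # agents at this level act under the goal
            action = agent.act(self.obs, goal)
            self.obs, reward, done = self.env.step(action)
            total_reward += reward
        # Aggregated feedback flows upward as the higher level's reward.
        return self.obs, total_reward, done


# Because LevelEnv exposes the same reset/step interface as BaseEnv,
# levels nest freely, giving hierarchies of arbitrary depth:
hierarchy = LevelEnv(LevelEnv(BaseEnv(), [Agent()]), [Agent()])
obs = hierarchy.reset()
obs, reward, done = hierarchy.step(3)  # top-level goal propagates down
```

Since each level only sees the standard interface of the level below, agents at different levels can use different learning algorithms, which is the loose coupling the abstract describes.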