Sable: a Performant, Efficient and Scalable Sequence Model for MARL

📅 2024-10-02
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Addressing the challenge of jointly optimizing performance, memory efficiency, and scalability in multi-agent reinforcement learning (MARL), this paper proposes Sable, a novel sequence model. Methodologically, Sable adapts the retention mechanism of Retentive Networks to MARL temporal modeling, integrating lightweight state compression, distributed sequence attention approximation, and multi-agent temporal embedding alignment for efficient long-horizon contextual modeling. Empirically, Sable achieves state-of-the-art (SOTA) performance on 34 of 45 tasks across six diverse MARL benchmark environments, scales to over 1,000 agents with linear memory growth, and maintains stable performance at large scale. Ablation studies confirm that each component contributes to both computational efficiency and representational capacity.

๐Ÿ“ Abstract
As multi-agent reinforcement learning (MARL) progresses towards solving larger and more complex problems, it becomes increasingly important that algorithms exhibit the key properties of (1) strong performance, (2) memory efficiency and (3) scalability. In this work, we introduce Sable, a performant, memory efficient and scalable sequence modeling approach to MARL. Sable works by adapting the retention mechanism in Retentive Networks to achieve computationally efficient processing of multi-agent observations with long context memory for temporal reasoning. Through extensive evaluations across six diverse environments, we demonstrate how Sable is able to significantly outperform existing state-of-the-art methods in a large number of diverse tasks (34 out of 45 tested). Furthermore, Sable maintains performance as we scale the number of agents, handling environments with more than a thousand agents while exhibiting a linear increase in memory usage. Finally, we conduct ablation studies to isolate the source of Sable's performance gains and confirm its efficient computational memory usage.
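The retention mechanism that Sable adapts comes from Retentive Networks, which replace softmax attention with an exponentially decaying recurrence: a fixed-size state is updated per step as S_n = γ·S_{n-1} + K_nᵀV_n, and the output is O_n = Q_n·S_n. This recurrent form is mathematically equivalent to a parallel form with a decay mask, but uses memory independent of sequence length, which is the key to the long-context, linear-memory behavior described in the abstract. Below is a minimal NumPy sketch of this standard retention recurrence; it illustrates the mechanism only, not Sable's actual multi-agent implementation, and the function names and γ value are illustrative.

```python
import numpy as np

def retention_recurrent(Q, K, V, gamma=0.9):
    """Recurrent form of retention: S_n = gamma * S_{n-1} + K_n^T V_n,
    O_n = Q_n S_n. The state S is a fixed (d x d) matrix, so memory
    does not grow with sequence length."""
    T, d = Q.shape
    S = np.zeros((d, d))
    out = np.zeros_like(V)
    for n in range(T):
        S = gamma * S + np.outer(K[n], V[n])  # accumulate decayed key-value state
        out[n] = Q[n] @ S                     # read out with the current query
    return out

def retention_parallel(Q, K, V, gamma=0.9):
    """Equivalent parallel form: O = (Q K^T * D) V, where
    D[n, m] = gamma^(n - m) for n >= m and 0 otherwise."""
    T = Q.shape[0]
    idx = np.arange(T)
    D = np.where(idx[:, None] >= idx[None, :],
                 gamma ** (idx[:, None] - idx[None, :]), 0.0)
    return ((Q @ K.T) * D) @ V
```

The two forms produce identical outputs; training can use the parallel form for throughput while inference uses the recurrent form for constant memory per step.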
Problem

Research questions and friction points this paper is trying to address.

Enhances MARL with performance and efficiency
Scales effectively with increasing agent numbers
Improves temporal reasoning in multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts retention mechanism
Efficient multi-agent processing
Scales linearly with agents