Adaptive Context Length Optimization with Low-Frequency Truncation for Multi-Agent Reinforcement Learning

📅 2025-10-30
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
To address the inefficient exploration and information redundancy caused by fixed context lengths in multi-agent reinforcement learning (MARL), this paper proposes an adaptive context-length optimization framework. The method introduces a centralized agent that dynamically adjusts each agent's input sequence length based on temporal gradient analysis. It also applies Fourier low-frequency truncation to extract denoised global temporal trends, yielding an efficient state representation. This design preserves local responsiveness while ensuring stable global convergence. Evaluated on diverse long-horizon benchmarks, including PettingZoo, MiniGrid, Google Research Football (GRF), and SMACv2, the approach significantly improves sample efficiency and final policy performance, achieving state-of-the-art results.
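The Fourier low-frequency truncation step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `keep_ratio` parameter and the per-agent `(T, d)` trajectory layout are assumptions.

```python
import numpy as np

def low_frequency_truncation(trajectory, keep_ratio=0.1):
    """Keep only the lowest-frequency Fourier components of a temporal
    signal, returning a denoised version that reflects its global trend.

    trajectory: (T, d) array of T timesteps and d features per timestep.
    keep_ratio: fraction of rfft frequency bins to retain (assumed knob).
    """
    T = trajectory.shape[0]
    spectrum = np.fft.rfft(trajectory, axis=0)      # (T//2 + 1, d) complex bins
    keep = max(1, int(keep_ratio * spectrum.shape[0]))
    spectrum[keep:] = 0                             # zero out high frequencies
    return np.fft.irfft(spectrum, n=T, axis=0)      # back to the time domain
```

Zeroing all but the first few bins discards fast fluctuations (noise, per-step jitter) while preserving the slow trend, which is the denoising effect the summary describes.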

📝 Abstract
Recently, deep multi-agent reinforcement learning (MARL) has demonstrated promising performance on challenging tasks, such as those with long-term dependencies and non-Markovian environments. Its success is partly attributed to conditioning policies on a large fixed context length. However, such large fixed context lengths may limit exploration efficiency and introduce redundant information. In this paper, we propose a novel MARL framework that obtains adaptive and effective contextual information. Specifically, we design a central agent that dynamically optimizes context length via temporal gradient analysis, enhancing exploration to facilitate convergence to global optima in MARL. Furthermore, to strengthen this adaptive optimization of context length, we present an efficient input representation for the central agent that filters out redundant information. By leveraging a Fourier-based low-frequency truncation method, we extract global temporal trends across decentralized agents, providing an effective and efficient representation of the MARL environment. Extensive experiments demonstrate that the proposed method achieves state-of-the-art (SOTA) performance on long-term dependency tasks, including PettingZoo, MiniGrid, Google Research Football (GRF), and StarCraft Multi-Agent Challenge v2 (SMACv2).
Problem

Research questions and friction points this paper is trying to address.

Optimizing adaptive context length to improve multi-agent reinforcement learning efficiency
Filtering redundant information via low-frequency truncation for better environment representation
Solving long-term dependency challenges in non-Markovian multi-agent environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic context length optimization via temporal gradient analysis
Fourier-based low-frequency truncation for information filtering
Central agent extracts global temporal trends from decentralized agents
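The page does not spell out how the temporal gradient analysis maps to a context-length update, so the following is only a hypothetical heuristic consistent with the idea: grow the context while older timesteps still carry non-negligible temporal variation (first differences serve as a crude gradient proxy), and shrink it when they are mostly redundant. The function name, thresholds, and doubling/halving schedule are all illustrative assumptions.

```python
import numpy as np

def adapt_context_length(signal, current_len, min_len=4, max_len=64, threshold=0.05):
    """Hypothetical context-length update driven by a temporal-gradient proxy.

    signal: 1-D array of a per-agent temporal quantity (e.g. reward or feature).
    Returns the new context length, doubled or halved within [min_len, max_len].
    """
    window = signal[-current_len:]
    grad = np.abs(np.diff(window))          # crude proxy for temporal gradient
    # mean gradient magnitude over the older half of the window
    older = grad[: len(grad) // 2].mean() if len(grad) >= 2 else 0.0
    if older > threshold:
        return min(current_len * 2, max_len)    # older context still informative
    return max(current_len // 2, min_len)       # older context redundant: shrink
```

A multiplicative grow/shrink schedule keeps the adjustment cheap per step; the actual paper optimizes the length with a learned central agent rather than a fixed rule like this.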
👥 Authors
Wenchang Duan (Shanghai Jiao Tong University)
Yaoliang Yu (University of Waterloo) · Machine learning, Optimization
Jiwan He (Shanghai Jiao Tong University)
Yi Shi (Shanghai Jiao Tong University)