Extending NGU to Multi-Agent RL: A Preliminary Study

📅 2025-12-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of coordinated exploration in multi-agent reinforcement learning (MARL) under sparse rewards. We propose the first extension of the Never Give Up (NGU) algorithm to the multi-agent setting, building on multi-agent DQN and integrating an episode-level novelty-based intrinsic reward. We systematically investigate three design choices: (i) a shared replay buffer across agents, (ii) cross-agent novelty sharing, parameterized by a tunable *k*, and (iii) heterogeneous β parameters for individualized intrinsic reward scaling. Experiments in the PettingZoo *simple_tag* environment show that a shared replay buffer significantly improves both performance and training stability, that novelty sharing with *k* = 1 enhances exploratory behavior, and that heterogeneous β yields no measurable benefit. The core contribution is empirical evidence that intrinsic motivation and experience sharing are synergistic, establishing a scalable, NGU-inspired approach to sparse-reward problems in MARL.

📝 Abstract
The Never Give Up (NGU) algorithm has proven effective in reinforcement learning tasks with sparse rewards by combining episodic novelty and intrinsic motivation. In this work, we extend NGU to multi-agent environments and evaluate its performance in the simple_tag environment from the PettingZoo suite. Compared to a multi-agent DQN baseline, NGU achieves moderately higher returns and more stable learning dynamics. We investigate three design choices: (1) shared replay buffer versus individual replay buffers, (2) sharing episodic novelty among agents using different k thresholds, and (3) using heterogeneous values of the beta parameter. Our results show that NGU with a shared replay buffer yields the best performance and stability, highlighting that the gains come from combining NGU intrinsic exploration with experience sharing. Novelty sharing performs comparably when k = 1 but degrades learning for larger values. Finally, heterogeneous beta values do not improve over a small common value. These findings suggest that NGU can be effectively applied in multi-agent settings when experiences are shared and intrinsic exploration signals are carefully tuned.
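The combination described above, an episodic novelty bonus added to the extrinsic reward with a per-agent weight, can be illustrated with a minimal sketch. This is an assumption-laden simplification, not the authors' implementation: the `episodic_novelty` function, its kernel constants, and the `shaped_reward` helper are illustrative names, and NGU's running distance normalisation and lifelong-novelty modulation are omitted.

```python
import numpy as np

def episodic_novelty(memory, embedding, k=10, eps=1e-3):
    """NGU-style episodic bonus: inverse kernel similarity to the
    k nearest neighbours of the current state embedding in this
    episode's memory (simplified sketch)."""
    if len(memory) == 0:
        return 1.0  # first state of the episode is maximally novel
    dists = np.array([np.sum((m - embedding) ** 2) for m in memory])
    knn = np.sort(dists)[:k]               # k nearest squared distances
    knn = knn / (knn.mean() + eps)         # normalise by mean k-NN distance
    kernel = eps / (knn + eps)             # inverse-distance kernel
    return 1.0 / np.sqrt(kernel.sum() + 1e-8)

def shaped_reward(r_ext, r_int, beta):
    # Per-agent training reward: extrinsic reward plus beta-scaled
    # novelty bonus; heterogeneous beta gives each agent its own beta_i.
    return r_ext + beta * r_int
```

States far from everything seen this episode receive a larger bonus than revisited ones; the cross-agent novelty sharing with threshold k studied in the paper would additionally query other agents' episodic memories, which is not shown here.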
Problem

Research questions and friction points this paper is trying to address.

How can NGU's episodic-novelty-driven exploration be extended from single-agent RL to multi-agent environments?
Does multi-agent NGU outperform a multi-agent DQN baseline in the sparse-reward simple_tag environment?
Which design choices (shared vs. individual replay buffers, cross-agent novelty sharing, heterogeneous β) actually help?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends NGU to multi-agent reinforcement learning.
Uses a shared replay buffer for cross-agent experience sharing.
Tunes intrinsic exploration signals (β) for stable learning.
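The shared-replay-buffer design, which the paper identifies as the main driver of the gains, amounts to pooling all agents' transitions in one buffer that every agent's learner samples from. A minimal sketch, assuming a simple uniform-sampling buffer (the class name and transition layout are illustrative, not from the paper):

```python
import random
from collections import deque

class SharedReplayBuffer:
    """Single experience pool shared by all agents: each agent appends
    its own transitions, and each agent's learner samples minibatches
    drawn from the combined pool (hypothetical minimal sketch)."""

    def __init__(self, capacity=100_000):
        self.storage = deque(maxlen=capacity)  # oldest entries evicted first

    def add(self, agent_id, obs, action, reward, next_obs, done):
        self.storage.append((agent_id, obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling over the pooled experience of all agents.
        return random.sample(self.storage, min(batch_size, len(self.storage)))
```

Under the individual-buffer baseline, each agent would instead own a private instance of such a buffer and see only its own transitions.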