Efficient Diversity-based Experience Replay for Deep Reinforcement Learning

📅 2024-10-27
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the limited experience diversity and low sampling efficiency of conventional experience replay in high-dimensional, complex environments—such as real-world robotic manipulation and 3D indoor navigation—this paper proposes a novel experience replay framework based on Determinantal Point Processes (DPPs). We are the first to formulate DPPs for quantifying experience diversity and performing diversity-aware prioritization. To ensure scalability, we employ Cholesky decomposition to accelerate kernel matrix computation, and integrate rejection sampling to enable efficient, unbiased sampling in high-dimensional state spaces. Evaluated on MuJoCo continuous control benchmarks, Atari discrete games, and Habitat-based realistic indoor navigation tasks, our method significantly improves sample efficiency and final policy performance. Empirical results demonstrate strong generalization across diverse domains and practical applicability to real-world embodied AI challenges.

Technology Category

Application Category

📝 Abstract
Experience replay is widely used to improve learning efficiency in reinforcement learning by leveraging past experiences. However, existing experience replay methods, whether based on uniform or prioritized sampling, often suffer from low efficiency, particularly in real-world scenarios with high-dimensional state spaces. To address this limitation, we propose a novel approach, Efficient Diversity-based Experience Replay (EDER). EDER employs a deterministic point process to model the diversity between samples and prioritizes replay based on the diversity between samples. To further enhance learning efficiency, we incorporate Cholesky decomposition for handling large state spaces in realistic environments. Additionally, rejection sampling is applied to select samples with higher diversity, thereby improving overall learning efficacy. Extensive experiments are conducted on robotic manipulation tasks in MuJoCo, Atari games, and realistic indoor environments in Habitat. The results demonstrate that our approach not only significantly improves learning efficiency but also achieves superior performance in high-dimensional, realistic environments.
Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning
Experience Replay
Complex Environment
Innovation

Methods, ideas, or system contributions that make the work stand out.

EDER
Cholesky decomposition
Rejection sampling
🔎 Similar Papers
No similar papers found.
Kaiyan Zhao
Kaiyan Zhao
The University of Tokyo
Natural Language Processing
Y
Yiming Wang
State Key Laboratory of Internet of Things for Smart City, University of Macau, Macao, China
Y
Yuyang Chen
Northwestern University, Evanston, IL, USA
X
Xiaoguang Niu
School of Computer Science, Wuhan University, Wuhan, China
Y
Yan Li
State Key Laboratory of Internet of Things for Smart City, University of Macau, Macao, China
Leong Hou U
Leong Hou U
University of Macau
Spatial and Spatio-Temporal DatabasesData VisualizationGraph LearningReinforcement Learning