Reducing Action Space for Deep Reinforcement Learning via Causal Effect Estimation

📅 2025-01-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the inefficient exploration caused by large, redundant action spaces in deep reinforcement learning, this paper proposes a causality-driven action space pruning method. The approach introduces causal inference into action selection: it leverages a pretrained inverse dynamics model and full-action-space classification to build an interpretable, quantitative, single-step causal effect estimator for actions, enabling dynamic suppression of ineffective actions. Unlike conventional heuristic or penalty-based redundancy elimination schemes, the method is theoretically grounded and provably improves exploration efficiency. Experiments across multiple simulated environments with diverse redundancy patterns show significant gains: sample efficiency and convergence speed improve markedly, while the number of ineffective action attempts drops by over 40%. The method also generalizes well across tasks and environments.

Technology Category

Application Category

πŸ“ Abstract
Intelligent decision-making within large and redundant action spaces remains challenging in deep reinforcement learning. Considering similar but ineffective actions at each step can lead to repetitive and unproductive trials. Existing methods attempt to improve agent exploration by reducing or penalizing redundant actions, yet they fail to provide quantitative and reliable evidence to determine redundancy. In this paper, we propose a method to improve exploration efficiency by estimating the causal effects of actions. Unlike prior methods, our approach offers quantitative results regarding the causality of actions for one-step transitions. We first pre-train an inverse dynamics model to serve as prior knowledge of the environment. Subsequently, we classify actions across the entire action space at each time step and estimate the causal effect of each action to suppress redundant actions during exploration. We provide a theoretical analysis to demonstrate the effectiveness of our method and present empirical results from simulations in environments with redundant actions to evaluate its performance. Our implementation is available at https://github.com/agi-brain/cee.git.
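The pipeline the abstract describes — score each action's single-step causal effect using a pretrained inverse dynamics model, then suppress low-effect (redundant) actions during exploration — can be sketched as follows. This is a minimal illustration, not the paper's implementation (see the linked repository for that): the function names `causal_effect_scores` and `masked_epsilon_greedy`, the use of softmaxed inverse-dynamics logits as effect scores, and the threshold `tau` are all illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def causal_effect_scores(inv_dyn_logits):
    # Hypothetical single-step causal effect estimate: the pretrained inverse
    # dynamics model scores, for an observed transition (s, s'), how plausibly
    # each action in the full action space caused that transition. Actions
    # with near-zero score are treated as causally ineffective (redundant).
    return softmax(inv_dyn_logits)

def masked_epsilon_greedy(q_values, effect_scores, tau=0.05, eps=0.1, rng=None):
    # Exploration step restricted to actions whose estimated causal effect
    # exceeds a threshold tau (an assumed hyperparameter, not from the paper).
    rng = rng or np.random.default_rng()
    mask = effect_scores >= tau
    if not mask.any():
        mask[:] = True  # fall back to the full action space if all are masked
    allowed = np.flatnonzero(mask)
    if rng.random() < eps:
        return int(rng.choice(allowed))          # explore among effective actions
    q = np.where(mask, q_values, -np.inf)        # greedy over effective actions
    return int(np.argmax(q))
```

The key design point is that masking happens per time step: an action suppressed in one state (because the inverse dynamics model deems it causally irrelevant there) can remain available in another.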
Problem

Research questions and friction points this paper is trying to address.

Repetitive Actions
Deep Learning
Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal Impact Computation
Action Redundancy Reduction
Deep Learning Efficiency
Wenzhang Liu
School of Artificial Intelligence, Anhui University; Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education
Lianjun Jin
School of Artificial Intelligence, Anhui University
Lu Ren
School of Artificial Intelligence, Anhui University
Chaoxu Mu
Tianjin University
Nonlinear system control and optimization; adaptive and learning systems; smart grid
Changyin Sun
School of Artificial Intelligence, Anhui University; Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education