Hierarchical Reinforcement Learning with Targeted Causal Interventions

📅 2025-07-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In hierarchical reinforcement learning (HRL), automatically discovering and effectively leveraging subgoal hierarchies remains challenging under long-horizon tasks with sparse rewards. Method: This paper proposes a causal-graph-based HRL framework that, for the first time, integrates causal discovery algorithms into subgoal modeling to automatically construct a directed causal graph over subgoals. It further introduces a theoretically grounded directional causal intervention mechanism, enabling the high-level policy to select and execute subgoals in an interpretable, low-variance manner. The framework accommodates both tree-structured and general directed acyclic graph (DAG) hierarchies, supporting rigorous theoretical analysis. Contribution/Results: Experiments on multiple standard HRL benchmarks demonstrate that the proposed method significantly outperforms existing baselines, improving sample efficiency by 30–50%. These results validate the dual benefits of causal structural modeling and intervention—enhancing both performance and interpretability in HRL.

Technology Category

Application Category

📝 Abstract
Hierarchical reinforcement learning (HRL) improves the efficiency of long-horizon reinforcement-learning tasks with sparse rewards by decomposing the task into a hierarchy of subgoals. The main challenge of HRL is efficient discovery of the hierarchical structure among subgoals and utilizing this structure to achieve the final goal. We address this challenge by modeling the subgoal structure as a causal graph and propose a causal discovery algorithm to learn it. Additionally, rather than intervening on the subgoals at random during exploration, we harness the discovered causal model to prioritize subgoal interventions based on their importance in attaining the final goal. These targeted interventions result in a significantly more efficient policy in terms of the training cost. Unlike previous work on causal HRL, which lacked theoretical analysis, we provide a formal analysis of the problem. Specifically, for tree structures and, for a variant of Erdős-Rényi random graphs, our approach results in remarkable improvements. Our experimental results on HRL tasks also illustrate that our proposed framework outperforms existing work in terms of training cost.
Problem

Research questions and friction points this paper is trying to address.

Discover hierarchical subgoal structure efficiently
Utilize causal graphs for subgoal prioritization
Improve training cost with targeted interventions
Innovation

Methods, ideas, or system contributions that make the work stand out.

HRL with causal graph modeling subgoal structure
Targeted interventions prioritize important subgoals
Formal analysis for tree and random graphs
🔎 Similar Papers
No similar papers found.
S
Sadegh Khorasani
School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland
Saber Salehkaleybar
Saber Salehkaleybar
Leiden University
Causal InferenceStochastic OptimizationReinforcement Learning
Negar Kiyavash
Negar Kiyavash
École polytechnique fédérale de Lausanne (EPFL)
causalityapplied probabilitynetwork forensicsrandom graphstime series
M
Matthias Grossglauser
School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland