Automating Curriculum Learning for Reinforcement Learning using a Skill-Based Bayesian Network

📅 2025-02-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Automatic curriculum generation in reinforcement learning aims to reduce training time or improve performance on a target task. This paper proposes the Skill-Environment Bayesian Network (SEBN), a probabilistic graphical model that jointly represents skills, reward-related goals, and environment features in order to predict policy performance on tasks, including unseen ones. Building on SEBN, the authors develop a task selection algorithm that uses the inferred estimates of agent success to weigh candidate next tasks by expected improvement. The method is evaluated in three domains: a discrete gridworld, continuous control, and simulated robotics. Curricula constructed with SEBN frequently outperform the other baselines.

📝 Abstract
A major challenge for reinforcement learning is automatically generating curricula to reduce training time or improve performance in some target task. We introduce SEBNs (Skill-Environment Bayesian Networks), which model a probabilistic relationship between a set of skills, a set of goals that relate to the reward structure, and a set of environment features to predict policy performance on (possibly unseen) tasks. We develop an algorithm that uses the inferred estimates of agent success from SEBN to weigh the possible next tasks by expected improvement. We evaluate the benefit of the resulting curriculum on three environments: a discrete gridworld, continuous control, and simulated robotics. The results show that curricula constructed using SEBN frequently outperform other baselines.
Problem

Research questions and friction points this paper is trying to address.

Automating curriculum generation for reinforcement learning.
Modeling skill-environment relationships using Bayesian networks.
Improving training efficiency and performance in diverse tasks.
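
To make the idea of predicting policy performance from skills and environment features concrete, here is a minimal sketch. It is not the paper's SEBN: the noisy-AND structure, the `slip` and `guess` parameters, and the skill names are all assumptions chosen for illustration. Each skill is a binary variable with a prior probability of being mastered, a task's environment features are assumed to determine which skills it requires, and the success probability is obtained by enumerating all skill states.

```python
from itertools import product


def predict_success(skill_priors, required_skills, slip=0.1, guess=0.05):
    """Return P(success) on a task under a hypothetical noisy-AND model.

    skill_priors:    dict mapping skill name -> P(skill is mastered)
    required_skills: skills the task needs (assumed to follow from its
                     environment features)
    slip:            P(failure) even when every required skill is mastered
    guess:           P(success) when at least one required skill is missing
    """
    names = list(skill_priors)
    p_success = 0.0
    # Enumerate every joint assignment of the binary skill variables.
    for states in product([0, 1], repeat=len(names)):
        assignment = dict(zip(names, states))
        # Skills are treated as independent, so the joint is a product.
        p_state = 1.0
        for name in names:
            p = skill_priors[name]
            p_state *= p if assignment[name] else (1.0 - p)
        has_all = all(assignment[s] for s in required_skills)
        p_success += p_state * ((1.0 - slip) if has_all else guess)
    return p_success


# Illustrative beliefs about two made-up skills.
priors = {"navigate": 0.8, "pickup": 0.5}
easy_task = predict_success(priors, ["navigate"])
hard_task = predict_success(priors, ["navigate", "pickup"])
```

Enumeration is exponential in the number of skills, so this only works for small illustrative models; a real Bayesian network library would use a proper inference routine.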
Innovation

Methods, ideas, or system contributions that make the work stand out.

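Skill-Environment Bayesian Network (SEBN) relating skills, goals, and environment features
Prediction of policy performance on possibly unseen tasks
Task-weighting algorithm that ranks candidate next tasks by expected improvement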
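
The task-weighting idea can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes some model has already produced, for each candidate training task, a predicted success probability on the target task after training on that candidate. The function names, the clipping of negative gains to zero, and the uniform fallback are all assumptions.

```python
import random


def expected_improvement_weights(current_success, predicted_success):
    """Turn predicted gains over the current success rate into weights.

    current_success:   current estimated P(success) on the target task
    predicted_success: dict mapping candidate task -> predicted P(success)
                       on the target task after training on that candidate
    """
    # Candidates predicted to hurt performance get zero weight.
    gains = {
        task: max(p - current_success, 0.0)
        for task, p in predicted_success.items()
    }
    total = sum(gains.values())
    if total == 0.0:
        # No candidate is predicted to help: fall back to uniform weights.
        return {task: 1.0 / len(gains) for task in gains}
    return {task: gain / total for task, gain in gains.items()}


def sample_next_task(weights, rng):
    """Sample one task in proportion to its weight."""
    tasks = list(weights)
    return rng.choices(tasks, weights=[weights[t] for t in tasks], k=1)[0]


# Illustrative numbers only.
weights = expected_improvement_weights(0.4, {"a": 0.5, "b": 0.7, "c": 0.3})
next_task = sample_next_task(weights, random.Random(0))
```

Sampling in proportion to the weights, rather than always taking the maximum, keeps some exploration across candidate tasks; a greedy variant would simply pick the task with the largest weight.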
Vincent Hsiao
Naval Research Laboratory, Washington DC, United States
Mark Roberts
Naval Research Laboratory, Washington DC, United States
Laura M. Hiatt
Naval Research Laboratory
Artificial Intelligence, Robotics, Human-Robot Interaction
G. Konidaris
Brown University, Providence RI, United States
Dana S. Nau
University of Maryland, College Park, MD, United States