Automating Curriculum Learning for Reinforcement Learning using a Skill-Based Bayesian Network

📅 2025-02-21
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Automatic curriculum generation in reinforcement learning faces challenges including slow convergence, poor generalization across tasks, and limited interpretability. To address these, we propose the Skill-Environment Bayesian Network (SEBN), the first probabilistic graphical model that jointly represents skills, reward objectives, and environmental features, enabling interpretable curriculum design, cross-task generalization, and principled uncertainty quantification. Building upon SEBN, we develop an expectation-based task selection algorithm that dynamically predicts policy performance on unseen tasks and selects the optimal training sequence to maximize expected performance gain. We evaluate our method across three distinct domains: discrete grid worlds, continuous control benchmarks, and simulated robotic manipulation. Compared to hand-crafted curricula and state-of-the-art baselines, our approach improves training efficiency by 32% on average and enhances final policy performance by 18–27%.

📝 Abstract
A major challenge for reinforcement learning is automatically generating curricula to reduce training time or improve performance in some target task. We introduce SEBNs (Skill-Environment Bayesian Networks) which model a probabilistic relationship between a set of skills, a set of goals that relate to the reward structure, and a set of environment features to predict policy performance on (possibly unseen) tasks. We develop an algorithm that uses the inferred estimates of agent success from SEBN to weigh the possible next tasks by expected improvement. We evaluate the benefit of the resulting curriculum on three environments: a discrete gridworld, continuous control, and simulated robotics. The results show that curricula constructed using SEBN frequently outperform other baselines.
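As a rough illustration of weighing candidate tasks by expected improvement, the Python sketch below turns per-task improvement estimates into sampling weights. It is not the paper's algorithm: the function names `task_weights` and `pick_next_task`, the clipping of negative gains, the uniform fallback, and the example numbers are all assumptions made for illustration. In the paper the estimates would come from SEBN inference; here they are simply supplied as a dictionary.

```python
import random


def task_weights(expected_gain):
    """Turn per-task expected improvement into sampling weights.

    expected_gain maps a task name to an assumed estimate of how much
    target-task success would improve if the agent trained on it next.
    """
    # Assumption: a task expected to hurt performance gets zero weight.
    clipped = {task: max(gain, 0.0) for task, gain in expected_gain.items()}
    total = sum(clipped.values())
    if total == 0.0:
        # No task is expected to help, so fall back to a uniform choice.
        return {task: 1.0 / len(clipped) for task in clipped}
    return {task: gain / total for task, gain in clipped.items()}


def pick_next_task(expected_gain, rng=random):
    """Sample the next training task in proportion to its weight."""
    weights = task_weights(expected_gain)
    tasks = list(weights)
    return rng.choices(tasks, weights=[weights[t] for t in tasks], k=1)[0]


# Hypothetical gridworld tasks and improvement estimates.
gains = {"empty_room": 0.02, "one_key": 0.10, "two_keys": 0.08, "lava": -0.05}
weights = task_weights(gains)
next_task = pick_next_task(gains, random.Random(0))
```

Sampling in proportion to the weights, rather than always taking the highest-scoring task, is one possible design choice: it keeps some exploration over tasks whose estimates may be inaccurate.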

Problem

Research questions and friction points this paper is trying to address.

Automating curriculum generation for reinforcement learning.
Modeling skill-environment relationships using Bayesian networks.
Improving training efficiency and performance in diverse tasks.

Innovation

Methods, ideas, or system contributions that make the work stand out.

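A skill-based Bayesian network (SEBN) relating skills, goals, and environment features.
Prediction of policy performance on possibly unseen tasks.
A task-weighting algorithm based on expected improvement.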
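To make the idea of predicting performance on an unseen task more concrete, here is a toy Python sketch. It is only a stand-in for SEBN inference, under strong assumptions: skills are treated as independent, a task succeeds only if every required skill is mastered, and the skill names, the `requires` mapping, and the belief values are hypothetical. The actual network structure and inference procedure in the paper may differ.

```python
from math import prod

# Hypothetical belief that the current policy has mastered each skill.
skill_belief = {"navigate": 0.9, "pick_key": 0.6, "open_door": 0.5}

# Hypothetical structure: skills called for by each goal or environment feature.
requires = {
    "goal:reach_exit": ["navigate"],
    "feature:locked_door": ["pick_key", "open_door"],
}


def predict_success(task_nodes, skill_belief, requires):
    """Estimate P(success) as the product of beliefs over required skills.

    Assumes independent skills and that all required skills are needed.
    """
    needed = {skill for node in task_nodes for skill in requires[node]}
    return prod(skill_belief[skill] for skill in needed)


# A task the agent has trained on, and a new combination it has not seen.
seen_task = ["goal:reach_exit"]
unseen_task = ["goal:reach_exit", "feature:locked_door"]

p_seen = predict_success(seen_task, skill_belief, requires)
p_unseen = predict_success(unseen_task, skill_belief, requires)
```

Because the prediction depends only on which skills a task requires, it can be computed for any combination of goals and features, including ones the agent has never trained on.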