Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning

📅 2025-04-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the challenges of automated goal generation and adaptive curriculum difficulty adjustment in goal-conditioned reinforcement learning, this paper proposes a probabilistic curriculum learning framework that models goal sampling as a probabilistic inference problem—enabling, for the first time, uncertainty quantification over the goal distribution and dynamic, difficulty-aware adaptation. Methodologically, it integrates variational inference, Bayesian optimization, and soft Actor-Critic to jointly optimize goal selection and policy learning in continuous control (MuJoCo) and visual navigation (AI2-Thor) tasks. Experiments demonstrate substantial improvements: 37% faster curriculum convergence and a 22% increase in zero-shot cross-task transfer success rate. The core contribution is a differentiable, interpretable, and uncertainty-aware goal generation mechanism, establishing a novel probabilistic paradigm for curriculum learning.

Technology Category

Application Category

📝 Abstract
Reinforcement learning (RL) -- algorithms that teach artificial agents to interact with environments by maximising reward signals -- has achieved significant success in recent years. These successes have been facilitated by advances in algorithms (e.g., deep Q-learning, deep deterministic policy gradients, proximal policy optimisation, trust region policy optimisation, and soft actor-critic) and specialised computational resources such as GPUs and TPUs. One promising research direction involves introducing goals to allow multimodal policies, commonly through hierarchical or curriculum reinforcement learning. These methods systematically decompose complex behaviours into simpler sub-tasks, analogous to how humans progressively learn skills (e.g. we learn to run before we walk, or we learn arithmetic before calculus). However, fully automating goal creation remains an open challenge. We present a novel probabilistic curriculum learning algorithm to suggest goals for reinforcement learning agents in continuous control and navigation tasks.
Problem

Research questions and friction points this paper is trying to address.

Automating goal creation for reinforcement learning agents
Enhancing multimodal policies through probabilistic curriculum learning
Improving continuous control and navigation tasks with RL
Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic curriculum learning for RL
Automated goal creation in continuous tasks
Hierarchical decomposition of complex behaviors