🤖 AI Summary
This work addresses the limitations of imitation learning in long-horizon tasks, where performance degrades due to error accumulation and excessive reliance on expert demonstrations. To overcome these challenges, the authors propose Cago, a novel method that integrates agent capability awareness into the goal-sampling process. Cago dynamically evaluates the agent’s current competence and adaptively selects intermediate goals that are slightly beyond its current ability, thereby constructing a progressive learning curriculum. By combining goal-conditioned reinforcement learning with imitation learning, Cago enables adaptive curriculum training without requiring perfect expert demonstrations, significantly reducing dependence on expert trajectories. Empirical results across multiple sparse-reward, goal-directed tasks demonstrate that Cago consistently outperforms existing imitation learning baselines in both sample efficiency and final performance.
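The core goal-sampling idea described above can be sketched in a few lines. Note this is an illustrative assumption of how capability-aware sampling might work, not the paper's actual algorithm: the function name, the per-step success-rate tracking, and the competence threshold are all hypothetical.

```python
def sample_goal(success_rates, threshold=0.8):
    """Capability-aware goal sampling (illustrative sketch).

    success_rates[t] is an estimate of how reliably the agent reaches
    the state at step t of an expert trajectory. We pick the earliest
    step the agent cannot yet reach reliably -- a goal "just beyond"
    its current competence -- so the curriculum advances step by step.
    """
    for t, rate in enumerate(success_rates):
        if rate < threshold:
            return t
    # Agent is competent along the whole trajectory: target the final goal.
    return len(success_rates) - 1

# Hypothetical competence estimates along one demonstration: early steps
# are mastered, later ones are not, so step 3 is sampled as the next goal.
rates = [0.95, 0.90, 0.85, 0.40, 0.10]
print(sample_goal(rates))  # -> 3
```

As the agent masters step 3, its estimated success rate there rises above the threshold and the sampled goal moves forward, producing the progressive curriculum the summary describes.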
📝 Abstract
Despite its promise, imitation learning often fails in long-horizon environments where perfect replication of demonstrations is unrealistic and small errors can accumulate catastrophically. We introduce Cago (Capability-Aware Goal Sampling), a novel learning-from-demonstrations method that mitigates the brittle dependence on expert trajectories for direct imitation. Unlike prior methods that rely on demonstrations only for policy initialization or reward shaping, Cago dynamically tracks the agent's competence along expert trajectories and uses this signal to select intermediate steps, goals just beyond the agent's current reach, to guide learning. This yields an adaptive curriculum that enables steady progress toward solving the full task. Empirical results demonstrate that Cago significantly improves sample efficiency and final performance across a range of sparse-reward, goal-conditioned tasks, consistently outperforming existing learning-from-demonstrations baselines.