SADP: Subgoal-Aware Diffusion Policy for Explainable Robots Learned from Foundation Model Generated Demonstrations

📅 2026-05-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of explicit subgoal modeling in existing imitation learning approaches, which leaves robot decision-making opaque in long-horizon tasks. The authors propose a subgoal-aware diffusion policy that uses foundation models to automatically generate demonstration data annotated with subgoals. The policy is trained to condition action generation on both task and subgoal descriptions, while a lightweight auxiliary head predicts subgoal completion status. Because the subgoal supervision derived from foundation models is embedded directly into policy training, the method is interpretable by construction rather than through post-hoc explanations. Experiments in RLBench simulation and on a real UR5e robot show that the approach achieves higher task success rates than task-conditioned diffusion baselines while providing subgoal-level execution signals for progress monitoring and failure diagnosis.

📝 Abstract

Explainable robots require not only successful task execution but also the ability to expose their internal decision-making process in a user-friendly manner. However, most imitation learning methods are trained solely on task-level demonstrations, without explicitly modeling subgoal structure or execution progress. This limitation is further exacerbated by the scarcity of subgoal-level supervision in standard robot learning datasets, which restricts the development of robots that can convey the subtasks they are executing during long-horizon manipulation. To address this issue, this paper proposes the Subgoal-Aware Diffusion Policy (SADP), a framework that leverages foundation models to autonomously generate subgoal-annotated demonstrations and trains diffusion policies on these datasets. SADP structures policy execution around human-interpretable subgoals by conditioning action generation on both task-level and subgoal-level descriptions. A lightweight auxiliary head further predicts subgoal completion states, allowing the robot to expose its current execution stage and monitor subgoal progression. Experiments in RLBench simulations and real-world evaluations on a UR5e robot demonstrate that SADP achieves higher task success rates than strong task-conditioned diffusion baselines, while providing subgoal-level execution signals for monitoring progress and diagnosing failures. These results highlight that built-in, rather than post-hoc, interpretability can coexist with high task performance.
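
The execution pattern described in the abstract (actions conditioned on a task and a subgoal description, plus a completion head that signals when a subgoal is done) can be illustrated with a toy control loop. This is a minimal sketch and not the authors' implementation: every name here (`run_episode`, `SubgoalTrace`, `demo`, the 1-D environment) is hypothetical, the policy is a hand-written stand-in rather than a diffusion model, and the ordered subgoal list is assumed to be known in advance.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

# Hypothetical types and names for illustration; none come from the paper.
Observation = List[float]
Action = Tuple[float, ...]


@dataclass
class SubgoalTrace:
    """Human-readable record of which subgoal was active at each step."""
    steps: List[Tuple[int, str]] = field(default_factory=list)
    completed: List[str] = field(default_factory=list)


def run_episode(
    task: str,
    subgoals: Sequence[str],
    policy: Callable[[Observation, str, str], Action],
    completion_head: Callable[[Observation, str, str], float],
    env_step: Callable[[Observation, Action], Observation],
    obs: Observation,
    max_steps: int = 100,
    threshold: float = 0.5,
) -> SubgoalTrace:
    """Act on (task, subgoal) and advance when the completion score passes threshold."""
    trace = SubgoalTrace()
    idx = 0
    for t in range(max_steps):
        if idx >= len(subgoals):
            break  # every subgoal has been reported complete
        subgoal = subgoals[idx]
        trace.steps.append((t, subgoal))         # exposes the current execution stage
        action = policy(obs, task, subgoal)      # action conditioned on task + subgoal
        obs = env_step(obs, action)
        if completion_head(obs, task, subgoal) >= threshold:
            trace.completed.append(subgoal)
            idx += 1
    return trace


def demo() -> SubgoalTrace:
    """Toy 1-D world: each subgoal is 'move the gripper to a target position'."""
    targets = {"reach the cup": 1.0, "lift the cup": 2.0}

    def policy(obs: Observation, task: str, subgoal: str) -> Action:
        # Stand-in for a learned policy: bounded step toward the subgoal target.
        delta = targets[subgoal] - obs[0]
        return (max(-0.5, min(0.5, delta)),)

    def completion_head(obs: Observation, task: str, subgoal: str) -> float:
        # Stand-in for a learned head: score 1.0 once the target is reached.
        return 1.0 if abs(obs[0] - targets[subgoal]) < 1e-6 else 0.0

    def env_step(obs: Observation, action: Action) -> Observation:
        return [obs[0] + action[0]]

    return run_episode(
        "pick up the cup", list(targets), policy, completion_head, env_step, [0.0]
    )
```

Calling `demo()` returns a trace whose `steps` list pairs each control step with the active subgoal and whose `completed` list records the subgoals reported done. If an episode ended with a subgoal missing from `completed`, that subgoal would be the natural place to start diagnosing the failure.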

Problem

Research questions and friction points this paper is trying to address.

explainable robots

subgoal-awareness

imitation learning

long-horizon manipulation

interpretability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Subgoal-Aware

Diffusion Policy

Foundation Model