🤖 AI Summary
Diffusion policies for robotic manipulation are prone to task failure because errors accumulate across generated action sequences. This work proposes a lightweight, classifier-based framework that steers pretrained diffusion policies away from failure modes at inference time via gradient guidance. Using an attention-based multiple instance learning approach, the method automatically labels observation-action segments as contributing to success or failure in a self-supervised manner, then trains a performance predictor on these labels to provide real-time gradient feedback. Requiring neither additional expert demonstrations nor expensive world models, the approach achieves consistent and significant performance improvements across multiple tasks in both the Robomimic and MimicGen benchmarks.
📝 Abstract
Diffusion policies have been shown to be highly effective at learning complex, multi-modal behaviors for robotic manipulation. However, errors in generated action sequences can compound over time, potentially leading to task failure. Existing approaches mitigate this by augmenting datasets with expert demonstrations or by learning predictive world models, both of which can be computationally expensive. We introduce Performance Predictive Guidance (PPGuide), a lightweight, classifier-based framework that steers a pretrained diffusion policy away from failure modes at inference time. PPGuide relies on a novel self-supervised process: it uses attention-based multiple instance learning to automatically estimate which observation-action chunks from the policy's rollouts are relevant to success or failure. We then train a performance predictor on this self-labeled data. During inference, this predictor provides a real-time gradient that guides the policy toward more robust actions. We validate PPGuide across a diverse set of tasks from the Robomimic and MimicGen benchmarks, demonstrating consistent performance improvements.
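The guidance mechanism described above can be sketched in miniature. The snippet below is a hedged illustration, not the paper's implementation: the `denoiser` is a trivial placeholder for the pretrained diffusion policy's noise predictor, and the performance predictor is reduced to a hand-written logistic model `p(success | a) = sigmoid(w · a)` whose gradient is analytic, whereas PPGuide's actual predictor is a learned attention-based network over observation-action chunks. All names (`guided_step`, `w`, the guidance `scale`) are illustrative assumptions; only the overall pattern, nudging each reverse-diffusion step along the gradient of predicted success, reflects the described method.

```python
import numpy as np

# Assumed toy predictor weights: p(success | a) = sigmoid(w @ a).
w = np.array([1.0, -0.5])

def denoiser(a_t, t):
    # Placeholder for the pretrained diffusion policy's noise prediction.
    return 0.1 * a_t

def success_log_grad(a):
    # Gradient of log p(success | a) for a logistic predictor:
    # d/da log sigmoid(w @ a) = (1 - sigmoid(w @ a)) * w
    return (1.0 - 1.0 / (1.0 + np.exp(-w @ a))) * w

def guided_step(a_t, t, scale=2.0, step=0.1):
    # One reverse-diffusion update, nudged along the predictor's gradient
    # so sampled actions drift toward higher predicted success.
    a_prev = a_t - step * denoiser(a_t, t)
    return a_prev + step * scale * success_log_grad(a_prev)

# Run a short guided denoising chain from a fixed starting action.
a = np.array([0.2, 0.3])
for t in reversed(range(10)):
    a = guided_step(a, t)
```

Because the guidance term is a positive multiple of `w`, the guided sample ends with a strictly higher predicted success probability than an unguided chain from the same start, which is the intended steering effect.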