AI Summary
This work addresses the inefficiency of imitation-learning inference caused by repeatedly sampling failed actions. We propose a recovery-oriented action generation framework based on conditional diffusion models. Methodologically, we decompose long-horizon failure recovery into composable sub-policies and introduce a guided resampling mechanism that dynamically corrects the sampling distribution using only successful demonstration data, without requiring additional exploration or a high-level controller. Our core contributions are: (1) policy decomposition via diffusion models, and (2) distribution-guided resampling grounded in successful trajectories. Evaluated on tasks including door opening with unknown opening directions, object manipulation, and button searching, our approach significantly improves action success rates and execution efficiency while remaining robust to varying numbers of prior failures.
Abstract
Imitation learning offers a promising approach to learning directly from data, without requiring explicit models, simulations, or detailed task definitions. During inference, actions are sampled from the learned distribution and executed on the robot. However, sampled actions may fail for various reasons, and simply repeating the sampling step until a successful action is found can be inefficient. In this work, we propose an enhanced sampling strategy that refines the sampling distribution to avoid previously unsuccessful actions. We show that, using only data from successful demonstrations, our method can infer recovery actions without additional exploratory behavior or a high-level controller. Furthermore, we leverage diffusion model decomposition to break the primary problem (which may require a long-horizon history to manage failures) into multiple smaller, more manageable sub-problems for learning, data collection, and inference, enabling the system to adapt to a variable number of prior failures. Our approach yields a low-level controller that dynamically adjusts its sampling space to improve efficiency when prior samples fall short. We validate our method on several tasks, including door opening with unknown directions, object manipulation, and button searching, and show that it outperforms traditional baselines.
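To make the guided-resampling idea concrete, the following is a minimal illustrative sketch, not the paper's actual diffusion-based implementation: a simple generative stand-in (Gaussian perturbations of demonstrated actions) replaces the conditional diffusion model, and candidates that land within an assumed exclusion `radius` of previously failed actions are down-weighted before one is returned. The function name `guided_resample` and all parameters here are hypothetical.

```python
import numpy as np

def guided_resample(demo_actions, failed_actions, rng,
                    radius=0.3, n_candidates=256):
    """Sample an action near successful demonstrations while steering
    away from previously failed attempts.

    Illustrative stand-in: a Gaussian mixture over demo actions plays
    the role of the learned conditional diffusion model.
    """
    # Stand-in generative model: perturb randomly chosen demo actions.
    idx = rng.integers(0, len(demo_actions), size=n_candidates)
    noise = rng.normal(0.0, 0.1, size=(n_candidates, demo_actions.shape[1]))
    candidates = demo_actions[idx] + noise

    if len(failed_actions) == 0:
        return candidates[0]

    # Distance from each candidate to its nearest prior failure.
    d = np.linalg.norm(
        candidates[:, None, :] - failed_actions[None, :, :], axis=-1
    ).min(axis=1)

    # Keep candidates outside the exclusion radius of all failures.
    ok = d > radius
    if ok.any():
        return candidates[ok][0]
    # Fallback: the candidate farthest from any prior failure.
    return candidates[np.argmax(d)]

rng = np.random.default_rng(0)
demos = np.array([[0.0, 0.0], [1.0, 1.0]])  # successful demo actions
fails = np.array([[0.02, 0.01]])            # a previously failed sample
a = guided_resample(demos, fails, rng)
```

After a failure near one demonstration mode, the resampler shifts probability mass toward unexplored modes, which is the intuition behind refining the sampling distribution with successful-demonstration data alone.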