🤖 AI Summary
To address the limited precision and slow execution of visuomotor imitation learning on complex manipulation tasks, this paper proposes Hybrid-Diffusion: a framework that tightly integrates learnable open-loop control routines with a visuomotor diffusion policy. It introduces Teleoperation Augmentation Primitives (TAPs), which allow action primitives to be embedded seamlessly in demonstrations and triggered autonomously at inference time. The framework models the end-to-end visuomotor policy with diffusion, TAPs enrich the representation of action sequences, and training is performed via imitation learning. Evaluated on real-world vial aspiration, open-container liquid transfer, and container unscrewing tasks, Hybrid-Diffusion achieves significant improvements (+23.5% in task success rate and a 1.8× average speedup), demonstrating high precision, fast response, and strong cross-task generalization.
📝 Abstract
Although visuomotor policies obtained via imitation learning demonstrate good performance in complex manipulation tasks, they usually struggle to match the accuracy and speed of traditional control-based methods. In this work, we introduce Hybrid-Diffusion, a model that combines open-loop routines with visuomotor diffusion policies. We develop Teleoperation Augmentation Primitives (TAPs) that allow the operator to seamlessly perform predefined routines during demonstrations, such as locking specific axes, moving to perching waypoints, or triggering task-specific routines. Our Hybrid-Diffusion method learns to trigger such TAPs during inference. We validate the method on challenging real-world tasks: Vial Aspiration, Open-Container Liquid Transfer, and Container Unscrewing. All experimental videos are available on the project's website: https://hybriddiffusion.github.io/