Hybrid-Diffusion Models: Combining Open-loop Routines with Visuomotor Diffusion Policies

📅 2025-12-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited precision and slow response of visuomotor imitation learning in complex manipulation tasks, this paper proposes Hybrid-Diffusion: a framework that tightly integrates learnable open-loop control routines with a visuomotor diffusion policy. It introduces Teleoperation Augmentation Primitives (TAPs), which embed action primitives seamlessly into demonstrations and are triggered autonomously during inference. The framework models the end-to-end visuomotor policy with diffusion, TAPs enrich the representation of action sequences, and training proceeds via imitation learning. Evaluated on real-world vial aspiration, open-container liquid transfer, and container-unscrewing tasks, Hybrid-Diffusion achieves significant gains (+23.5% in task success rate and a 1.8× average speedup), demonstrating high precision, rapid response, and strong cross-task generalization.

📝 Abstract
Although visuomotor policies obtained via imitation learning demonstrate good performance in complex manipulation tasks, they usually struggle to achieve the same accuracy and speed as traditional control-based methods. In this work, we introduce Hybrid-Diffusion models that combine open-loop routines with visuomotor diffusion policies. We develop Teleoperation Augmentation Primitives (TAPs) that allow the operator to perform predefined routines, such as locking specific axes, moving to perching waypoints, or triggering task-specific routines, seamlessly during demonstrations. Our Hybrid-Diffusion method learns to trigger such TAPs during inference. We validate the method on challenging real-world tasks: Vial Aspiration, Open-Container Liquid Transfer, and Container Unscrewing. All experimental videos are available on the project's website: https://hybriddiffusion.github.io/
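The abstract describes an inference loop in which the learned policy either emits closed-loop actions or hands control to an open-loop TAP routine. A minimal sketch of that control flow is below; all names (`TAP`, `hybrid_step`, the stand-in policy, the observation keys) are illustrative assumptions, not APIs from the paper.

```python
class TAP:
    """Hypothetical open-loop routine, e.g. 'move to perching waypoint'."""
    def __init__(self, name, waypoints):
        self.name = name
        self.waypoints = waypoints

    def run(self):
        # Executed open-loop: replay fixed waypoints without visual feedback.
        return list(self.waypoints)


def dummy_diffusion_policy(obs):
    """Stand-in for the visuomotor diffusion policy: returns either a small
    corrective action or a token that triggers a named TAP."""
    if obs.get("near_vial"):
        return ("trigger_tap", "aspirate")
    return ("action", [0.0, 0.0, -0.01])


def hybrid_step(obs, taps):
    """One inference step: closed-loop action, or an open-loop TAP rollout."""
    kind, payload = dummy_diffusion_policy(obs)
    if kind == "trigger_tap":
        return taps[payload].run()  # hand control to the open-loop routine
    return [payload]                # single closed-loop action


taps = {"aspirate": TAP("aspirate", [[0, 0, -0.05], [0, 0, 0.05]])}
print(hybrid_step({"near_vial": True}, taps))   # replays the TAP waypoints
print(hybrid_step({"near_vial": False}, taps))  # one closed-loop action
```

The key design point the sketch captures is that precision-critical segments run as fixed routines (fast, repeatable) while the diffusion policy handles the visually-conditioned segments and decides when to switch.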
Problem

Research questions and friction points this paper is trying to address.

Visuomotor policies learned via imitation struggle to match the accuracy and speed of traditional control-based methods
Complex real-world manipulation tasks such as vial aspiration and liquid transfer demand both high precision and fast execution
Predefined open-loop routines are difficult to embed in demonstrations and to trigger at the right moment during autonomous execution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid-Diffusion models that combine open-loop routines with visuomotor diffusion policies
Teleoperation Augmentation Primitives (TAPs): predefined routines (axis locking, perching waypoints, task-specific routines) invoked seamlessly during demonstrations
Learning to trigger TAPs autonomously during inference, validated on challenging real-world tasks
👥 Authors

Jonne Van Haastregt
INCAR Robotics AB, Sweden

Bastian Orthmann
INCAR Robotics AB, Sweden

Michael C. Welle
Postdoctoral researcher, KTH Royal Institute of Technology
Machine Learning, Robotics

Yuchong Zhang
KTH Royal Institute of Technology, Sweden

Danica Kragic
Professor of Computer Science, KTH Royal Institute of Technology
Robotics, AI, robot vision, robot learning