Annotation-Free One-Shot Imitation Learning for Multi-Step Manipulation Tasks

📅 2025-09-29
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of enabling robots to learn multi-step manipulation tasks from a single human demonstration—without additional model training or manual annotation. We propose an end-to-end one-shot imitation learning framework that employs a lightweight vision-to-action mapping network built upon a pre-trained visual encoder. To enhance cross-task generalization, we incorporate contrastive learning and design the architecture to support plug-and-play integration of diverse backbone models. Our key contribution is the first demonstration of high-performance, fine-tuning-free, annotation-free one-shot imitation for long-horizon multi-step tasks—overcoming dual bottlenecks in task length scalability and deployment efficiency inherent in prior methods. Experiments show average success rates of 82.5% on multi-step tasks and 90% on single-step tasks, substantially outperforming baselines while maintaining computational efficiency and architectural extensibility.

Technology Category

Application Category

📝 Abstract
Recent advances in one-shot imitation learning have enabled robots to acquire new manipulation skills from a single human demonstration. While existing methods achieve strong performance on single-step tasks, they remain limited in their ability to handle long-horizon, multi-step tasks without additional model training or manual annotation. We propose a method that can be applied to this setting provided a single demonstration without additional model training or manual annotation. We evaluated our method on multi-step and single-step manipulation tasks where our method achieves an average success rate of 82.5% and 90%, respectively. Our method matches and exceeds the performance of the baselines in both these cases. We also compare the performance and computational efficiency of alternative pre-trained feature extractors within our framework.
Problem

Research questions and friction points this paper is trying to address.

Enables robots to learn multi-step tasks without annotations
Eliminates need for additional model training or manual labeling
Achieves high success rates on complex manipulation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Annotation-free one-shot imitation learning for manipulation
Handles multi-step tasks without additional training
Uses pre-trained feature extractors for efficiency
🔎 Similar Papers
No similar papers found.
V
Vijja Wichitwechkarn
Dept. of Engineering, University of Cambridge, Cambridge CB2 1PZ
E
Emlyn Williams
School of Computer Science, University of Lincoln, Lincoln LN6 7TS
C
Charles Fox
School of Computer Science, University of Lincoln, Lincoln LN6 7TS
Ruchi Choudhary
Ruchi Choudhary
Professor, Engineering Department, University of Cambridge
building simulationuncertainty quantificationurban energy analysisbuilding physics