Human-Robot Copilot for Data-Efficient Imitation Learning

📅 2026-04-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the performance degradation of imitation learning when demonstration data is limited, a setting in which compounding errors or environmental stochasticity push the policy into out-of-distribution states. The authors propose a human-in-the-loop copilot framework that combines the Human-Gated DAgger (HG-DAgger) paradigm with a scaled teleoperation interface, so that high-quality demonstrations can be added through only intermittent human intervention. By pairing fine-grained control with compatibility across a wide range of manipulators, the approach targets the trade-off between correction precision and platform generality that limits existing methods. Experiments show that, given the same number of demonstration trajectories, the framework achieves higher policy performance, and because corrections are needed only intermittently, data collection is more efficient and less time-consuming.
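The human-gated collection loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `hg_dagger_rollout` and its arguments `policy`, `expert`, `gate`, and `step_fn` are hypothetical names chosen for illustration, and the environment is reduced to a plain state-transition function. The sketch assumes the usual HG-DAgger convention that only the states visited while the human is in control are recorded, each labelled with the human's action.

```python
def hg_dagger_rollout(policy, expert, gate, initial_state, step_fn, horizon):
    """Run one episode with human gating (illustrative sketch, hypothetical API).

    policy(state)  -> action proposed by the learned policy
    expert(state)  -> corrective action from the human operator
    gate(state)    -> True while the human chooses to take control
    step_fn(s, a)  -> next state

    Returns the recorded corrections and the number of separate interventions.
    """
    state = initial_state
    corrections = []          # (state, human_action) pairs to add to the dataset
    interventions = 0         # number of times the human took over
    human_in_control = False

    for _ in range(horizon):
        wants_control = gate(state)
        if wants_control and not human_in_control:
            interventions += 1            # a new takeover begins here
        human_in_control = wants_control

        if human_in_control:
            action = expert(state)
            corrections.append((state, action))   # only human-controlled steps are stored
        else:
            action = policy(state)                # policy acts on its own otherwise

        state = step_fn(state, action)

    return corrections, interventions
```

In a full training loop, the returned corrections would be merged into the demonstration set and the policy retrained before the next rollout; that step is omitted here.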

📝 Abstract

Collecting human demonstrations via teleoperation is a common approach for teaching robots task-specific skills. However, when only a limited number of demonstrations are available, policies are prone to entering out-of-distribution (OOD) states due to compounding errors or environmental stochasticity. Existing interactive imitation learning or human-in-the-loop methods try to address this issue by following the Human-Gated DAgger (HG-DAgger) paradigm, an approach that augments demonstrations through selective human intervention during policy execution. Nevertheless, these approaches struggle to balance dexterity and generality: they either provide fine-grained corrections but are limited to specific kinematic structures, or achieve generality at the cost of precise control. To overcome this limitation, we propose the Human-Robot Copilot framework that can leverage a scaling factor for dexterous teleoperation while maintaining compatibility with a wide range of industrial and research manipulators. Experimental results demonstrate that our framework achieves higher performance with the same number of demonstration trajectories. Moreover, since corrective interventions are required only intermittently, the overall data collection process is more efficient and less time-consuming.
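To make the role of a scaling factor concrete, here is a minimal sketch of scaled teleoperation. The abstract does not specify how the factor is applied, so this assumes it multiplies the operator's Cartesian hand displacement before the result is added to the end-effector position; the names `ScaledTeleop`, `map_delta`, and `apply_command` are hypothetical. Working in Cartesian space is also an assumption, chosen here because it does not depend on a particular kinematic structure.

```python
from dataclasses import dataclass


@dataclass
class ScaledTeleop:
    # Hypothetical scaling factor: values below 1 shrink operator motion
    # for fine corrections, values above 1 amplify it for coarse moves.
    scale: float

    def map_delta(self, operator_delta):
        """Scale an operator displacement (x, y, z) into an end-effector displacement."""
        return tuple(self.scale * d for d in operator_delta)


def apply_command(ee_position, operator_delta, teleop):
    """Return the commanded end-effector position after one scaled teleoperation step."""
    step = teleop.map_delta(operator_delta)
    return tuple(p + s for p, s in zip(ee_position, step))
```

With `scale=0.5`, a 10 cm hand motion becomes a 5 cm end-effector motion, which is the kind of attenuation that would make small corrective adjustments easier to perform.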

Problem

Research questions and friction points this paper is trying to address.

Imitation Learning

Out-of-Distribution

Human-in-the-Loop

Teleoperation

Data Efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-Robot Copilot

Data-Efficient Imitation Learning

Dexterous Teleoperation