Human-Robot Copilot for Data-Efficient Imitation Learning

📅 2026-04-04
🤖 AI Summary
This work addresses the performance degradation of imitation learning when demonstration data is limited, where compounding errors or environmental stochasticity push the policy into out-of-distribution states. The authors propose a human-in-the-loop copilot framework that combines the Human-Gated DAgger (HG-DAgger) paradigm with a teleoperation interface built around a scaling factor, so that high-quality demonstrations can be added through only intermittent human intervention. By pairing fine-grained control with compatibility across a wide range of manipulators, the approach targets the trade-off between correction precision and platform generality found in existing methods. Experiments show that, with the same number of demonstration trajectories, the framework achieves higher policy performance, and because corrections are needed only intermittently, data collection is more efficient and less time-consuming.
📝 Abstract
Collecting human demonstrations via teleoperation is a common approach for teaching robots task-specific skills. However, when only a limited number of demonstrations are available, policies are prone to entering out-of-distribution (OOD) states due to compounding errors or environmental stochasticity. Existing interactive imitation learning or human-in-the-loop methods try to address this issue by following the Human-Gated DAgger (HG-DAgger) paradigm, an approach that augments demonstrations through selective human intervention during policy execution. Nevertheless, these approaches struggle to balance dexterity and generality: they either provide fine-grained corrections but are limited to specific kinematic structures, or achieve generality at the cost of precise control. To overcome this limitation, we propose the Human-Robot Copilot framework that can leverage a scaling factor for dexterous teleoperation while maintaining compatibility with a wide range of industrial and research manipulators. Experimental results demonstrate that our framework achieves higher performance with the same number of demonstration trajectories. Moreover, since corrective interventions are required only intermittently, the overall data collection process is more efficient and less time-consuming.
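The human-gated collection loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 1-D state, the function names (`scale_command`, `rollout_with_gating`, `drifting_policy`, `gated_human`), the intervention threshold, and the value of the scaling factor are all assumptions made for illustration. The policy acts on its own until the human takes over, the operator's motion is multiplied by a scaling factor before being applied, and only the human-controlled steps are recorded as new demonstration data. Retraining the policy on the aggregated data is omitted.

```python
from typing import Callable, List, Optional, Tuple

State = float
Action = float


def scale_command(operator_delta: float, scale: float) -> float:
    """Map an operator displacement to a robot displacement.

    A scale below 1 shrinks the operator's motion, which is one way a
    scaling factor could give finer control (assumed behavior).
    """
    return scale * operator_delta


def rollout_with_gating(
    policy: Callable[[State], Action],
    human: Callable[[State], Optional[float]],
    state: State,
    steps: int,
    scale: float,
) -> Tuple[List[Tuple[State, Action]], int, State]:
    """Run one episode in which the human may take over at any step.

    `human` returns None to leave the policy in control, or an operator
    displacement to intervene. Only intervention steps are recorded.
    """
    corrections: List[Tuple[State, Action]] = []
    interventions = 0
    for _ in range(steps):
        operator_delta = human(state)
        if operator_delta is None:
            action = policy(state)  # policy stays in control
        else:
            action = scale_command(operator_delta, scale)
            corrections.append((state, action))  # new demonstration data
            interventions += 1
        state = state + action  # toy dynamics: the action is a displacement
    return corrections, interventions, state


SCALE = 0.5  # hypothetical scaling factor


def drifting_policy(state: State) -> Action:
    """A deliberately poor policy that drifts away from the goal at 0."""
    return 0.25


def gated_human(state: State) -> Optional[float]:
    """Intervene only when the state has drifted far from the goal."""
    if abs(state) > 0.5:
        return -state / SCALE  # operator motion that returns the robot to 0
    return None


if __name__ == "__main__":
    data, n_interventions, final_state = rollout_with_gating(
        drifting_policy, gated_human, state=0.0, steps=8, scale=SCALE
    )
    print(data, n_interventions, final_state)
```

In this toy rollout the human intervenes on only two of the eight steps, which mirrors the idea that corrections are intermittent rather than continuous.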
Problem

Research questions and friction points this paper is trying to address.

Imitation Learning
Out-of-Distribution
Human-in-the-Loop
Teleoperation
Data Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-Robot Copilot
Data-Efficient Imitation Learning
Dexterous Teleoperation
Out-of-Distribution Correction
Scalable Robot Control