Data-Efficient Multitask DAgger

📅 2025-09-29

📈 Citations: 0

✨ Influential: 0

career value

201K/year

🤖 AI Summary

General-purpose robotic policies typically require large-scale expert demonstrations or extensive simulation training; however, existing approaches suffer from low data efficiency and struggle to achieve high success rates in multi-task generalization from limited demonstrations. Method: We propose a performance-aware multi-task policy distillation framework that unifies task-specific expert policies into a single generalist policy. A Kalman-filter-based learning gain estimator dynamically allocates scarce expert demonstrations to maximize data efficiency. The framework integrates DAgger, behavioral cloning, and multi-task learning to enable cross-task knowledge transfer. Contribution/Results: Evaluated on MetaWorld and IsaacLab drawer-opening tasks, our method achieves significantly higher zero-shot transfer success rates on real robots compared to baselines, while reducing required expert demonstrations by 40–60%. It demonstrates superior data efficiency and scalable generalization across diverse manipulation tasks.

Technology Category

Application Category

📝 Abstract

Generalist robot policies that can perform many tasks typically require extensive expert data or simulations for training. In this work, we propose a novel Data-Efficient multitask DAgger framework that distills a single multitask policy from multiple task-specific expert policies. Our approach significantly increases the overall task success rate by actively focusing on tasks where the multitask policy underperforms. The core of our method is a performance-aware scheduling strategy that tracks how much each task's learning process benefits from the amount of data, using a Kalman filter-based estimator to robustly decide how to allocate additional demonstrations across tasks. We validate our approach on MetaWorld, as well as a suite of diverse drawer-opening tasks in IsaacLab. The resulting policy attains high performance across all tasks while using substantially fewer expert demonstrations, and the visual policy learned with our method in simulation shows better performance than naive DAgger and Behavior Cloning when transferring zero-shot to a real robot without using real data.

Problem

Research questions and friction points this paper is trying to address.

Developing efficient multitask robot policies with minimal expert data

Improving task success by focusing on underperforming areas

Reducing demonstration requirements while maintaining high performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-Efficient multitask DAgger framework distills single policy

Performance-aware scheduling strategy allocates demonstrations across tasks

Kalman filter-based estimator robustly tracks task learning benefits

🔎 Similar Papers

No similar papers found.

Toyota Research Institute

Los Altos, CA / Cambridge, MA

Research Scientist Intern, Robotic Control Policy (PhD)