Active Multi-task Policy Fine-tuning

📅 2024-10-07

📈 Citations: 0

✨ Influential: 0

career value

230K/year

🤖 AI Summary

To address the scarcity of demonstration data in multi-task robotic policy fine-tuning, this paper proposes the first active learning framework specifically designed for multi-task policy adaptation. The method dynamically selects the most discriminative tasks for demonstration collection using an information-gain criterion and jointly optimizes task selection and policy update. Built upon behavioral cloning and neural policy fine-tuning, it incorporates a regularity assumption to establish a theoretical lower bound on performance. Experiments in high-dimensional, complex environments demonstrate that our approach significantly outperforms baselines—including random sampling and round-robin selection—under identical demonstration budgets: multi-task average performance improves by 18.7% (±2.3%). These results validate the framework’s superior data efficiency and generalization capability.

Technology Category

Application Category

📝 Abstract

Pre-trained generalist policies are rapidly gaining relevance in robot learning due to their promise of fast adaptation to novel, in-domain tasks. This adaptation often relies on collecting new demonstrations for a specific task of interest and applying imitation learning algorithms, such as behavioral cloning. However, as soon as several tasks need to be learned, we must decide which tasks should be demonstrated and how often? We study this multi-task problem and explore an interactive framework in which the agent adaptively selects the tasks to be demonstrated. We propose AMF (Active Multi-task Fine-tuning), an algorithm to maximize multi-task policy performance under a limited demonstration budget by collecting demonstrations yielding the largest information gain on the expert policy. We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness to efficiently fine-tune neural policies in complex and high-dimensional environments.

Problem

Research questions and friction points this paper is trying to address.

Adapting pre-trained policies to multiple novel tasks efficiently

Optimizing task demonstration selection under limited budget

Maximizing multi-task policy performance via active fine-tuning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Active Multi-task Fine-tuning maximizes performance

Adaptive task selection for demonstrations

Information gain optimizes demonstration efficiency

🔎 Similar Papers

No similar papers found.