Leveraging Temporally Extended Behavior Sharing for Multi-task Reinforcement Learning

📅 2025-09-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Multi-task reinforcement learning (MTRL) in robotics suffers from low sample efficiency and limited cross-task knowledge transfer, primarily due to the high cost of collecting diverse task-specific data. To address this, we propose MT-Lévy—a novel MTRL framework that jointly leverages cross-task behavioral policy reuse and Lévy flight–inspired long-range temporal exploration. Crucially, MT-Lévy introduces an adaptive exploration intensity mechanism, dynamically modulated by task success rates, to jointly optimize exploration breadth and policy transfer depth. Evaluated on multiple robotic control benchmarks, MT-Lévy achieves substantial improvements: +37.2% average gain in sample efficiency and +22.8% increase in final task success rate. Ablation studies confirm the necessity and complementary roles of its three core components—policy reuse, Lévy exploration, and dynamic exploration modulation—demonstrating their synergistic contribution to robust multi-task learning.

Technology Category

Application Category

📝 Abstract
Multi-task reinforcement learning (MTRL) offers a promising approach to improve sample efficiency and generalization by training agents across multiple tasks, enabling knowledge sharing between them. However, applying MTRL to robotics remains challenging due to the high cost of collecting diverse task data. To address this, we propose MT-Lévy, a novel exploration strategy that enhances sample efficiency in MTRL environments by combining behavior sharing across tasks with temporally extended exploration inspired by Lévy flight. MT-Lévy leverages policies trained on related tasks to guide exploration towards key states, while dynamically adjusting exploration levels based on task success ratios. This approach enables more efficient state-space coverage, even in complex robotics environments. Empirical results demonstrate that MT-Lévy significantly improves exploration and sample efficiency, supported by quantitative and qualitative analyses. Ablation studies further highlight the contribution of each component, showing that combining behavior sharing with adaptive exploration strategies can significantly improve the practicality of MTRL in robotics applications.
Problem

Research questions and friction points this paper is trying to address.

Improving sample efficiency in multi-task reinforcement learning
Addressing high data collection costs in robotics MTRL
Enhancing exploration strategies through behavior sharing across tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combining behavior sharing with Lu00e9vy flight exploration
Using trained policies to guide exploration towards key states
Dynamically adjusting exploration levels based on success ratios
🔎 Similar Papers
No similar papers found.