AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification

📅 2025-06-06
🤖 AI Summary
Problem: In sparse-reward skill-based reinforcement learning (SBRL), exploration efficiency and skill diversity are conflicting objectives, and optimizing both at once hinders effective policy learning. Method: This paper proposes an adaptive multi-objective projection framework. During pretraining, gradient surgery enables joint optimization of the two objectives, coupled with a parameter-free adaptive gradient balancing mechanism. During fine-tuning, a task-aware dynamic skill selection module improves downstream adaptability. Contribution/Results: According to the authors, this is the first work to explicitly model and optimize exploration and skill diversity jointly. Evaluated on multiple SBRL benchmarks, the approach outperforms state-of-the-art methods, with improved policy generalization and task transfer.

📝 Abstract
Skill-based reinforcement learning (SBRL) enables rapid adaptation in environments with sparse rewards by pretraining a skill-conditioned policy. Effective skill learning requires jointly maximizing both exploration and skill diversity. However, existing methods often face challenges in simultaneously optimizing for these two conflicting objectives. In this work, we propose a new method, Adaptive Multi-objective Projection for balancing Exploration and skill Diversification (AMPED), which explicitly addresses both exploration and skill diversification. We begin by conducting extensive ablation studies to identify and define a set of objectives that effectively capture the aspects of exploration and skill diversity, respectively. During the skill pretraining phase, AMPED introduces a gradient surgery technique to balance the objectives of exploration and skill diversity, mitigating conflicts and reducing reliance on heuristic tuning. In the subsequent fine-tuning phase, AMPED incorporates a skill selector module that dynamically selects suitable skills for downstream tasks, based on task-specific performance signals. Our approach achieves performance that surpasses SBRL baselines across various benchmarks. These results highlight the importance of explicitly harmonizing exploration and diversity and demonstrate the effectiveness of AMPED in enabling robust and generalizable skill learning. Project Page: https://geonwoo.me/amped/
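The abstract does not spell out the gradient surgery step, so the following is only a minimal sketch of one common form of gradient surgery (a PCGrad-style projection), not the authors' implementation. It assumes the exploration and diversity objectives each yield a flat gradient vector; the names `project_away_conflict` and `combine_gradients` are hypothetical. When the two gradients point in conflicting directions (negative dot product), each is projected onto the plane orthogonal to the other before they are summed.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_away_conflict(g: Vector, other: Vector) -> Vector:
    """Return g with its component along `other` removed, but only if they conflict.

    Two gradients "conflict" here when their dot product is negative.
    Non-conflicting gradients are returned unchanged.
    """
    d = dot(g, other)
    norm_sq = dot(other, other)
    if d >= 0.0 or norm_sq == 0.0:
        return list(g)
    scale = d / norm_sq
    return [gi - scale * oi for gi, oi in zip(g, other)]


def combine_gradients(g_explore: Vector, g_diverse: Vector) -> Vector:
    """Sum the two objective gradients after mutual conflict projection.

    Assumption: a plain sum of the projected gradients; the paper's adaptive
    balancing between objectives is not modeled in this sketch.
    """
    p_explore = project_away_conflict(g_explore, g_diverse)
    p_diverse = project_away_conflict(g_diverse, g_explore)
    return [a + b for a, b in zip(p_explore, p_diverse)]


if __name__ == "__main__":
    g_explore = [1.0, 0.0]   # hypothetical exploration gradient
    g_diverse = [-1.0, 1.0]  # hypothetical diversity gradient (conflicts with g_explore)
    update = combine_gradients(g_explore, g_diverse)
    # After projection, the update no longer opposes either objective.
    print(update, dot(update, g_explore) >= 0.0, dot(update, g_diverse) >= 0.0)
```

In this toy case the raw sum would be `[0.0, 1.0]`, which makes no progress on the exploration objective; the projected update has a non-negative dot product with both original gradients.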
Problem

Research questions and friction points this paper is trying to address.

Balancing exploration and skill diversity in reinforcement learning
Mitigating conflicts between exploration and skill diversification objectives
Dynamically selecting skills for downstream task adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gradient surgery balances exploration and diversity
Dynamic skill selector for task adaptation
Multi-objective optimization for skill learning
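The dynamic skill selector is described only as choosing skills from task-specific performance signals. As a rough illustration of that idea, and not the paper's module, the sketch below treats each pretrained skill as a bandit arm and picks skills epsilon-greedily from a running mean of downstream returns. The class name `EpsilonGreedySkillSelector`, the `epsilon` parameter, and the use of episode return as the performance signal are all assumptions.

```python
import random
from typing import List


class EpsilonGreedySkillSelector:
    """Hypothetical selector: tracks the mean downstream return of each skill."""

    def __init__(self, num_skills: int, epsilon: float = 0.1, seed: int = 0) -> None:
        self.epsilon = epsilon
        self.counts: List[int] = [0] * num_skills
        self.mean_returns: List[float] = [0.0] * num_skills
        self.rng = random.Random(seed)

    def select(self) -> int:
        """Pick a random skill with probability epsilon, else the best-so-far skill."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        best = max(self.mean_returns)
        candidates = [i for i, m in enumerate(self.mean_returns) if m == best]
        return self.rng.choice(candidates)  # break ties at random

    def update(self, skill: int, episode_return: float) -> None:
        """Fold one observed episode return into the skill's running mean."""
        self.counts[skill] += 1
        n = self.counts[skill]
        self.mean_returns[skill] += (episode_return - self.mean_returns[skill]) / n


if __name__ == "__main__":
    selector = EpsilonGreedySkillSelector(num_skills=4, epsilon=0.1)
    # Placeholder rewards: pretend skill 2 suits the downstream task best.
    fake_return = {0: 0.1, 1: 0.0, 2: 1.0, 3: 0.2}
    for _ in range(200):
        skill = selector.select()
        selector.update(skill, fake_return[skill])
    print(selector.mean_returns, selector.counts)
```

A learned selector network conditioned on the task would be a natural alternative; the bandit form is used here only because it keeps the example short and self-contained.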
Geonwoo Cho
Gwangju Institute of Science and Technology
Reinforcement Learning
Jaemoon Lee
Seoul National University
Jaegyun Im
Gwangju Institute of Science and Technology
Subi Lee
Gwangju Institute of Science and Technology
Jihwan Lee
Gwangju Institute of Science and Technology
Sundong Kim
Assistant Professor, GIST
AGI, Artificial Intelligence, Machine Learning, Data Mining