🤖 AI Summary
This work addresses the challenge of computing a Pareto front of policies in multi-objective reinforcement learning. The authors propose Iterated Pareto Referent Optimisation (IPRO), a framework that decomposes Pareto front approximation into a sequence of constrained single-objective problems. IPRO guarantees convergence and, at each iteration, provides an upper bound on the distance to undiscovered Pareto-optimal solutions. It requires no prior preference information or convexity assumptions and is compatible with arbitrary single-objective solvers. Empirical evaluation on standard benchmarks shows that IPRO matches or outperforms methods that require additional assumptions, in both hypervolume and utility-based metrics. By leveraging problem-specific solvers, the approach also generalises beyond reinforcement learning, e.g., to planning and pathfinding.
📝 Abstract
An important challenge in multi-objective reinforcement learning is obtaining a Pareto front of policies to attain optimal performance under different preferences. We introduce Iterated Pareto Referent Optimisation (IPRO), which decomposes finding the Pareto front into a sequence of constrained single-objective problems. This enables us to guarantee convergence while providing an upper bound on the distance to undiscovered Pareto-optimal solutions at each step. We evaluate IPRO on utility-based metrics and hypervolume, and find that it matches or outperforms methods that require additional assumptions. By leveraging problem-specific single-objective solvers, our approach also holds promise for applications beyond multi-objective reinforcement learning, such as planning and pathfinding.
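To make the decomposition idea concrete, the following is a minimal toy sketch, not the paper's algorithm: each "referent" point spawns a single-objective subproblem (here, an augmented Chebyshev achievement scalarising function over a finite candidate set), and solutions that strictly improve on their referent are collected into a front. The candidate set, referent choices, and scalarisation are illustrative assumptions.

```python
# Toy sketch of Pareto-front-by-decomposition: each referent defines a
# single-objective subproblem; solutions improving on the referent are kept.
# The candidates, referents, and scalarisation here are illustrative only.

def asf(v, r, rho=1e-3):
    """Augmented Chebyshev achievement scalarising function: worst-case
    improvement over the referent, with a small sum term to break ties
    in favour of Pareto-optimal points."""
    diffs = [vi - ri for vi, ri in zip(v, r)]
    return min(diffs) + rho * sum(diffs)

def dominates(a, b):
    """True if value vector a Pareto-dominates b."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def approximate_pareto_front(candidates, referents):
    front = []
    for r in referents:
        # Single-objective subproblem: maximise the scalarised value.
        best = max(candidates, key=lambda v: asf(v, r))
        # Keep only solutions that strictly improve on the referent.
        if all(b > ri for b, ri in zip(best, r)) and best not in front:
            front.append(best)
    # Discard any points dominated by others found across subproblems.
    return [p for p in front if not any(dominates(q, p) for q in front if q != p)]

# Toy bi-objective problem: each tuple is a candidate policy's value vector.
candidates = [(0, 3), (1, 2), (2, 2), (3, 0), (1, 1)]
referents = [(-1, 2.5), (0.5, 1.5), (2.5, -1)]
print(approximate_pareto_front(candidates, referents))
# → [(0, 3), (2, 2), (3, 0)]
```

In IPRO proper, the single-objective subproblems are solved by an arbitrary RL (or planning) solver and the referents are chosen so that each iteration tightens a bound on the distance to undiscovered Pareto-optimal solutions; this sketch replaces both with naive enumeration over a fixed grid.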