Controllability in preference-conditioned multi-objective reinforcement learning

📅 2026-05-11

🤖 AI Summary

Existing evaluation metrics in multi-objective reinforcement learning (MORL) struggle to assess how well preference-conditioned agents respond to user intent, and they lack a quantitative measure of controllability. This work establishes controllability as a critical property of MORL systems and introduces a metric designed specifically to evaluate the controllability of preference-conditioned policies, along with a complementary evaluation protocol. The proposed approach reveals that current agents can perform well on standard benchmarks while remaining insensitive to preference inputs, a limitation that prevailing evaluation practices obscure. These findings expose a significant gap in the dominant MORL assessment paradigm and motivate the community to re-examine how agent performance is evaluated in preference-based settings.

📝 Abstract

Multi-objective reinforcement learning (MORL) allows a user to express preferences over outcomes in terms of the relative importance of the objectives, but standard metrics cannot capture whether changes in preference reliably change the agent's behavior in the intended way, a property termed controllability. As a result, preference-conditioned agents can score well on standard MORL metrics while being insensitive to the preference input. If the ability to control agents cannot be reliably assessed, the symbolic interface that MORL provides between user intent and agent behavior is broken. Mainstream MORL metrics alone fail to measure the controllability of preference-conditioned agents, motivating a complementary metric designed specifically for that purpose. We hope the results spur discussion in the community about existing evaluation protocols and help consolidate advances in preference adaptation as MORL is applied to larger and more complex problems.
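
To make the idea of preference insensitivity concrete, here is a minimal sketch of one way to check whether an agent's returns respond to its preference input. It is an illustration only and is not the metric proposed in the paper, which the abstract does not define. The function name `preference_sensitivity`, the use of Spearman rank correlation, and the toy data are all assumptions made for this example. The sketch assumes each preference is a weight vector over objectives and that one vector of per-objective returns has been recorded for each preference.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation; 0.0 if either sequence is constant."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # no variation means no measurable response
    return cov / (vx * vy) ** 0.5


def preference_sensitivity(preferences, returns):
    """Hypothetical score (an assumption, not the paper's metric).

    For each objective, correlate the weight placed on it with the return
    achieved on it, then average over objectives. Values near 1 mean that
    raising an objective's weight tends to raise its return; values near 0
    mean the returns do not track the preference input.
    """
    n_objectives = len(preferences[0])
    per_objective = [
        spearman([w[k] for w in preferences], [r[k] for r in returns])
        for k in range(n_objectives)
    ]
    return mean(per_objective)


# Toy data (invented for illustration): five preferences over two objectives.
preferences = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]
responsive_returns = [(1, 9), (3, 8), (5, 5), (8, 3), (9, 1)]
insensitive_returns = [(5, 5)] * 5  # same outcome whatever the preference

responsive_score = preference_sensitivity(preferences, responsive_returns)
insensitive_score = preference_sensitivity(preferences, insensitive_returns)
```

In this toy setup the responsive agent's returns move monotonically with the weights, while the insensitive agent produces the same outcome for every preference, so only the first receives a high score. A rank-based measure is used here because it checks only the direction of the response and does not assume a linear relationship between weights and returns.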

Problem

Research questions and friction points this paper is trying to address.

controllability

preference-conditioned

multi-objective reinforcement learning

evaluation metrics

user intent

Innovation

Methods, ideas, or system contributions that make the work stand out.

controllability

preference-conditioned reinforcement learning

multi-objective reinforcement learning