🤖 AI Summary
In multi-agent multi-objective systems (MAMOS), heterogeneous utility functions induce non-stationary learning dynamics and hinder convergence to Bayesian Nash equilibria (BNE). Method: We propose the first theoretically grounded framework for BNE approximation under decentralized execution, establishing that global utility awareness is a necessary condition for BNE, which motivates a centralized-training-with-decentralized-execution paradigm. Our approach implicitly models joint beliefs over other agents' utilities and policies via an attention-based mechanism (Agent-Attention), integrates multi-agent deep reinforcement learning, and explicitly models the utility space to optimize local policies from partial observations. Contribution/Results: Evaluated on the MAMO Particle and MOMALand benchmarks, our method significantly outperforms state-of-the-art approaches, demonstrating strong effectiveness, robustness to utility heterogeneity, and a theoretically grounded approximation of BNE under decentralized execution.
📝 Abstract
Multi-agent multi-objective systems (MAMOS) have emerged as powerful frameworks for modelling complex decision-making problems across real-world domains such as robotic exploration, autonomous traffic management, and sensor network optimisation. MAMOS offers enhanced scalability and robustness through decentralised control and more accurately reflects the inherent trade-offs between conflicting objectives. In MAMOS, each agent uses a utility function that maps return vectors to scalar values. Existing MAMOS optimisation methods struggle in heterogeneous objective and utility settings, where training non-stationarity is intensified by private utility functions and the policies associated with them. In this paper, we first theoretically prove that direct access to, or structured modelling of, the global utility functions is necessary for approximating a Bayesian Nash equilibrium (BNE) under decentralised execution constraints. To access the global utility functions while preserving decentralised execution, we propose an Agent-Attention Multi-Agent Multi-Objective Reinforcement Learning (AA-MAMORL) framework. Our approach implicitly learns a joint belief over other agents' utility functions and their associated policies during centralised training, effectively mapping global states and utilities to each agent's policy. During execution, each agent independently selects actions based on its local observations and private utility function to approximate a BNE, without relying on inter-agent communication. We conduct comprehensive experiments in both a custom-designed MAMO Particle environment and the standard MOMALand benchmark. The results demonstrate that access to global preferences and the proposed AA-MAMORL significantly improve performance, consistently outperforming state-of-the-art methods.
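The two building blocks described above can be sketched concretely: a utility function that scalarises a per-objective return vector, and an attention step in which an agent's centralised critic attends over the other agents' embeddings to form an implicit belief. This is a minimal illustration only; the function names, shapes, and the choice of a linear utility are assumptions, not the paper's exact formulation.

```python
import numpy as np

def linear_utility(returns: np.ndarray, weights: np.ndarray) -> float:
    """Map a vector of per-objective returns to a scalar via preference
    weights (a common linear-utility assumption; each agent's weights are
    private in the MAMOS setting)."""
    return float(np.dot(weights, returns))

def agent_attention(query: np.ndarray, keys: np.ndarray, values: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention over other agents' embeddings.
    query: (d,) embedding of the current agent; keys/values: (n_agents, d)
    embeddings of the others (e.g. encoding their states and utility weights
    during centralised training). Returns a weighted 'belief' embedding."""
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)       # (n_agents,) similarity scores
    scores -= scores.max()                   # subtract max for numerical stability
    attn = np.exp(scores) / np.exp(scores).sum()  # softmax attention weights
    return attn @ values                     # (d,) aggregated belief embedding
```

For example, an agent weighting two objectives equally scalarises the return vector `[1.0, 3.0]` to `2.0`; with a zero query, the attention weights are uniform and the belief embedding reduces to the mean of the value vectors.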