🤖 AI Summary
In multi-agent reinforcement learning (MARL), a fundamental tension exists between individual self-interest and collective welfare. To address this, we propose Suggestion Sharing (SS), a novel mechanism in which agents exchange only action suggestions, without sharing rewards, value functions, policy parameters, or sensitive state information. We theoretically bound the discrepancy between individual and collective objectives, showing how exchanged suggestions can align agents' behaviours with the collective optimum. Empirically, SS matches or exceeds mainstream baselines that share rewards, values, or policies on canonical social-dilemma benchmarks, while revealing substantially less private information, thus striking a favourable trade-off between cooperative efficiency and privacy preservation. Our core contribution is a lightweight, decentralized, and privacy-preserving pathway to collective optimization, offering a principled alternative to conventional centralized or information-intensive coordination mechanisms.
📝 Abstract
In human society, the conflict between self-interest and collective well-being often obstructs efforts to achieve shared welfare. Related concepts such as the Tragedy of the Commons and social dilemmas frequently manifest in our daily lives. As artificial agents increasingly serve as autonomous proxies for humans, we propose using multi-agent reinforcement learning (MARL) to address this issue: learning policies that maximise collective returns even when individual agents' interests conflict with the collective interest. Traditional MARL solutions involve sharing rewards, values, or policies, or designing intrinsic rewards to encourage agents to learn collectively optimal policies. We introduce a novel MARL approach based on Suggestion Sharing (SS), in which agents exchange only action suggestions. This method enables effective cooperation without the need to design intrinsic rewards, achieving strong performance while revealing less private information than sharing rewards, values, or policies. Our theoretical analysis establishes a bound on the discrepancy between collective and individual objectives, demonstrating how sharing suggestions can align agents' behaviours with the collective objective. Experimental results show that SS performs competitively with baselines that rely on value or policy sharing or intrinsic rewards.
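The core idea above can be illustrated with a toy sketch: in a public-goods-style dilemma, defection is individually dominant, but each agent benefits when its peers cooperate, so each agent *suggests* cooperation to the others; folding those suggestions into each agent's action choice tips the group toward the collectively optimal outcome. This is a minimal illustration only, not the paper's algorithm: the payoff constants, the additive suggestion bonus, the `weight` parameter, and all function names are hypothetical choices made for this sketch.

```python
COOPERATE, DEFECT = 0, 1

def reward(my_action, others):
    """Toy public-goods payoff: each cooperator adds 0.4 to everyone's
    reward; defecting grants a private 0.5 bonus. Defection is the
    individually dominant move, yet mutual cooperation yields a higher
    collective return -- a social dilemma."""
    n_coop = (1 - my_action) + sum(1 - a for a in others)
    return 0.4 * n_coop + (0.5 if my_action == DEFECT else 0.0)

def choose(own_values, suggestions, weight):
    """Pick the action scoring highest under the agent's own value
    estimates plus a bonus of `weight` per peer suggestion received
    for that action (the additive blend is an assumption of this toy)."""
    score = list(own_values)
    for s in suggestions:
        score[s] += weight
    return max(range(len(score)), key=score.__getitem__)

n_agents = 4
# Self-interested value estimates an isolated agent would learn:
# cooperating yields only its own 0.4 contribution, defecting yields 0.5.
own_values = [0.4, 0.5]

# Without suggestions, every agent defects.
no_ss = [choose(own_values, [], weight=0.0) for _ in range(n_agents)]

# Each peer suggests cooperation, since the peer's cooperation raises
# this agent's own reward; the suggestion bonus tips the choice.
suggestions = [COOPERATE] * (n_agents - 1)
with_ss = [choose(own_values, suggestions, weight=0.3) for _ in range(n_agents)]

print("without SS:", no_ss)  # all defect
print("with SS:   ", with_ss)  # all cooperate
```

In this toy, the all-defect profile gives each agent 0.5 (collective return 2.0), while all-cooperate gives each 1.6 (collective return 6.4), so acting on exchanged suggestions closes the gap between individual and collective objectives without any agent revealing its rewards, values, or policy.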