Socially-Weighted Alignment: A Game-Theoretic Framework for Multi-Agent LLM Systems

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the negative externalities that arise when individually rational large language model (LLM) agents share an environment: locally optimal decisions lead to persistent congestion and degraded system performance. To mitigate this, the authors propose the Socially-Weighted Alignment (SWA) framework, which at inference time interpolates between an agent's individual objective and an estimate of collective welfare via a social weight parameter λ, encouraging agents to voluntarily suppress demand under overload. The paper positions SWA as the first integration of game-theoretic social preference mechanisms into multi-agent LLM reasoning, inducing a system-level phase transition without parameter updates or reinforcement learning. Theoretical analysis yields a critical threshold λ* = (n − β)/(n − 1), and simulations confirm that when λ exceeds λ*, the system shifts from persistent congestion to stable operation near capacity, substantially improving overall efficiency.

📝 Abstract
Deploying large language model (LLM) agents in shared environments introduces a fundamental tension between individual alignment and collective stability: locally rational decisions can impose negative externalities that degrade system-level performance. We propose Socially-Weighted Alignment (SWA), a game-theoretic framework that modifies inference-time decision making by interpolating between an agent's private objective and an estimate of group welfare via a social weight $\lambda\in[0,1]$. In a shared-resource congestion game with $n$ agents and congestion severity $\beta$, we show that SWA induces a critical threshold $\lambda^*=(n-\beta)/(n-1)$ above which agents no longer have marginal incentive to increase demand under overload, yielding a phase transition from persistent congestion to stable operation near capacity. We further provide an inference-time algorithmic instantiation of SWA that does not require parameter updates or multi-agent reinforcement learning, and use a multi-agent simulation to empirically validate the predicted threshold behavior.
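The stated threshold can be reproduced under one simple payoff model (an assumption on our part; the paper's exact congestion game may differ). Suppose that under overload each agent's private payoff is u_i = x_i − (β/n)·X, where x_i is its own demand and X the total, and that SWA has each agent maximize V_i = (1 − λ)·u_i + λ·(1/n)·Σ_j u_j. The marginal incentive ∂V_i/∂x_i then changes sign exactly at λ* = (n − β)/(n − 1):

```python
# Hypothetical payoff model chosen to be consistent with the paper's threshold;
# the symbols u_i, x_i, X, beta, and lam follow the abstract's notation.

def marginal_incentive(lam: float, n: int, beta: float) -> float:
    """dV_i/dx_i under overload: positive means the agent still gains by raising demand.

    Derived from u_i = x_i - (beta/n)*X and V_i = (1-lam)*u_i + (lam/n)*sum_j u_j.
    """
    return (1 - lam) * (1 - beta / n) + (lam / n) * (1 - beta)

def critical_lambda(n: int, beta: float) -> float:
    """The threshold lambda* = (n - beta) / (n - 1) from the abstract."""
    return (n - beta) / (n - 1)

n, beta = 10, 3.0
lam_star = critical_lambda(n, beta)  # 7/9, about 0.778

# Just below lambda*, agents still have a marginal incentive to push demand;
# just above it, the incentive turns negative and demand is voluntarily suppressed.
assert marginal_incentive(lam_star - 1e-3, n, beta) > 0
assert marginal_incentive(lam_star + 1e-3, n, beta) < 0
```

This sketch only checks the sign flip of the marginal incentive, i.e., the condition behind the predicted phase transition; the paper's multi-agent simulation presumably validates the full dynamics around capacity.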
Problem

Research questions and friction points this paper is trying to address.

multi-agent systems
large language models
collective stability
negative externalities
shared environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Socially-Weighted Alignment
game-theoretic framework
multi-agent LLM systems
inference-time alignment
congestion game