Multi-Stakeholder LLM Alignment: Decomposing Estimation from Aggregation

📅 2026-05-26
🤖 AI Summary
This work addresses the challenge of aligning large language models with multiple stakeholders whose preferences may conflict. Holistic LLM judges conflate utility estimation with utility aggregation, which yields unstable implicit weights. The paper calls this aggregation-specific instability “weighting noise” and shows, empirically and theoretically, that it can cause large score shifts when stakeholder satisfaction is dispersed; in the reported experiments, these shifts also grow with the number of stakeholders. To address this, the authors propose DecompR, which decouples the two components: counterfactual-calibrated weights are fixed from the query structure before any candidate is scored, and each role's utility is estimated independently. This removes candidate-dependent weight drift and reduces estimation noise.
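The link between weighting noise and dispersed satisfaction can be illustrated with a short identity. This is only a sketch under assumptions that are not taken from the paper: linear aggregation with weights w_i and per-role utilities u_i, and a weight perturbation δ_i that leaves the weights summing to one (so the δ_i sum to zero). The symbols are illustrative notation, not the authors'.

```latex
S = \sum_{i=1}^{n} w_i u_i, \qquad
\tilde{S} = \sum_{i=1}^{n} (w_i + \delta_i)\, u_i, \qquad
\sum_{i=1}^{n} \delta_i = 0
\;\Longrightarrow\;
\tilde{S} - S = \sum_{i=1}^{n} \delta_i u_i
             = \sum_{i=1}^{n} \delta_i \,(u_i - \bar{u}),
\qquad \bar{u} = \frac{1}{n} \sum_{i=1}^{n} u_i .
```

Under these assumptions the score shift depends only on how far each utility sits from the mean: if every stakeholder is equally satisfied the shift is zero, and it grows as the utilities spread out.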
📝 Abstract
Multi-stakeholder tasks require one output to satisfy users with conflicting preferences. Holistic LLM judges conflate utility estimation and utility aggregation, yielding unstable implicit weights. We show empirically and theoretically that this aggregation-specific weighting noise can create large score shifts when stakeholder satisfaction is dispersed; in our experiments, these weight-induced shifts also increase with stakeholder count. We propose DecompR: counterfactual-calibrated weights are fixed from query structure before candidate scoring, while per-role utilities are estimated independently, removing candidate-dependent weight drift and reducing estimation noise.
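The contrast between fixed and drifting weights can be sketched with a small simulation. This is not the DecompR implementation: the linear aggregation, the Gaussian noise model, and every name below (`aggregate`, `noisy_weights`, `score_spread`, the example utilities) are assumptions made for illustration only.

```python
import random
import statistics


def aggregate(weights, utilities):
    """Weighted sum of per-role utilities (weights assumed to sum to 1)."""
    return sum(w * u for w, u in zip(weights, utilities))


def noisy_weights(base, scale, rng):
    """Assumed noise model: add Gaussian noise to each weight, keep it
    positive, then renormalize so the weights sum to 1 again."""
    raw = [max(w + rng.gauss(0.0, scale), 1e-9) for w in base]
    total = sum(raw)
    return [w / total for w in raw]


def score_spread(utilities, base, scale=0.1, trials=2000, seed=0):
    """Standard deviation of the aggregate score when only the weights
    fluctuate and the per-role utilities stay fixed."""
    rng = random.Random(seed)
    scores = [
        aggregate(noisy_weights(base, scale, rng), utilities)
        for _ in range(trials)
    ]
    return statistics.pstdev(scores)


# Hypothetical example: four stakeholder roles with equal base weights.
base = [0.25, 0.25, 0.25, 0.25]
uniform = [0.6, 0.6, 0.6, 0.6]      # everyone equally satisfied
dispersed = [0.1, 0.9, 0.2, 1.0]    # satisfaction varies widely across roles

# Weights fixed before scoring: one deterministic score per candidate.
fixed_score = aggregate(base, dispersed)

# Weights that drift between evaluations: the score spreads out only
# when the per-role utilities are dispersed.
spread_uniform = score_spread(uniform, base)
spread_dispersed = score_spread(dispersed, base)

print(f"fixed-weight score:         {fixed_score:.3f}")
print(f"spread, uniform utilities:   {spread_uniform:.6f}")
print(f"spread, dispersed utilities: {spread_dispersed:.6f}")
```

In this toy setup, fixing the weights ahead of scoring removes the weight-driven spread entirely, while the drifting weights leave the uniform case essentially untouched and move the dispersed case noticeably.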
Problem

Research questions and friction points this paper is trying to address.

Multi-Stakeholder Alignment
LLM
Utility Aggregation
Weighting Noise
Preference Conflict
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Stakeholder Alignment
Utility Decomposition
Weighting Noise
Counterfactual Calibration
LLM Alignment