Differentiable Normative Guidance for Nash Bargaining Solution Recovery

📅 2026-03-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing negotiation systems struggle to satisfy individual rationality (IR) and approximate the Nash bargaining solution (NBS) at the same time when the Pareto frontier is unknown. This work proposes a guided graph diffusion framework that, at inference time, generates utility allocations that satisfy IR while closely approximating the NBS, without requiring prior knowledge of the Pareto frontier. The method models each negotiation as a directed graph and uses graph attention to capture asymmetric agent attributes. It integrates a differentiable composite guidance loss into a conditional diffusion model, penalizing both IR violations and Nash product gaps during the final reverse diffusion steps. Theoretically, with sufficiently large penalty weights, the solution enters the IR region within finitely many steps. Experiments show 100% IR compliance across datasets and Nash efficiencies of 99.45%, 54.24%, and 88.67% on synthetic data, CaSiNo, and Deal or No Deal, respectively, substantially outperforming unconstrained baselines.
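
For readers unfamiliar with the two criteria named in the summary, the standard textbook definitions can be written as follows. The notation is an assumption introduced here for illustration (it is not taken from the paper): u_i is the utility allocated to agent i, d_i is that agent's outside option, and F is the set of feasible utility vectors, which the summary says is not known at inference time.

```latex
% Assumed notation: u_i = utility of agent i, d_i = outside option of agent i,
% \mathcal{F} = feasible set of utility vectors.
\begin{align*}
\text{IR:}  \quad & u_i \ge d_i \quad \text{for all } i = 1, \dots, n \\
\text{NBS:} \quad & u^{\star} = \arg\max_{u \in \mathcal{F},\; u \ge d} \; \prod_{i=1}^{n} (u_i - d_i)
\end{align*}
```

The product being maximized is the Nash product, which is the quantity the guidance loss refers to when it penalizes "Nash product gaps".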

📝 Abstract

Autonomous artificial intelligence agents in negotiation systems must generate equitable utility allocations that satisfy individual rationality (IR), meaning each agent receives at least its outside option, and that approximate the Nash Bargaining Solution (NBS), which maximizes the product of the agents' gains over their outside options. Existing generative models often learn suboptimal human behaviors, producing solutions far from Pareto efficiency, while classical methods require full knowledge of the Pareto frontier, which is unavailable in real datasets. We propose a guided graph diffusion framework that generates individually rational utility vectors while approximating the NBS without frontier knowledge at inference time. Negotiations are modeled as directed graphs, with graph attention capturing asymmetric agent attributes, and a conditional diffusion model maps these graphs to utility vectors. A differentiable composite guidance loss, applied in the final reverse diffusion steps, penalizes IR violations and Nash product gaps. We prove that, under sufficient penalty weighting, solutions enter the IR region in finite time. Across datasets, the method achieves 100% IR compliance. Nash efficiency reaches 99.45% on synthetic data (within 0.55 percentage points of an oracle), 54.24% on CaSiNo, and 88.67% on Deal or No Deal, an improvement of 20 to 60 percentage points over unconstrained generative baselines.
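
The idea of a composite guidance loss that penalizes IR violations and rewards a larger Nash product can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, weights, margin, and step size are all hypothetical, the negative log of the Nash product is used as a stand-in because the abstract does not give the exact form of the "Nash product gap", and there is no diffusion model here, only plain gradient steps on a utility vector with clipping to [0, 1] as a crude substitute for the denoiser.

```python
import math

def guidance_loss(u, d, w_ir=8.0, w_nash=0.01, margin=0.02, eps=1e-3):
    """Hypothetical composite loss: squared IR hinge minus log Nash product."""
    loss = 0.0
    for ui, di in zip(u, d):
        surplus = ui - di  # gain over the outside option
        # IR term: penalize surplus below a small positive margin.
        loss += w_ir * max(0.0, margin - surplus) ** 2
        # Nash term (assumed surrogate): -log of each surplus factor,
        # floored at eps so the log stays defined for IR violators.
        loss -= w_nash * math.log(max(surplus, eps))
    return loss

def guidance_grad(u, d, w_ir=8.0, w_nash=0.01, margin=0.02, eps=1e-3):
    """Analytic gradient of guidance_loss with respect to u."""
    grad = []
    for ui, di in zip(u, d):
        surplus = ui - di
        g = -2.0 * w_ir * max(0.0, margin - surplus)
        if surplus > eps:  # the floored log has zero gradient below eps
            g -= w_nash / surplus
        grad.append(g)
    return grad

def guided_steps(u, d, n_steps=20, step=0.05):
    """Apply a few guidance steps, clipping utilities to [0, 1]."""
    u = list(u)
    for _ in range(n_steps):
        g = guidance_grad(u, d)
        u = [min(1.0, max(0.0, ui - step * gi)) for ui, gi in zip(u, g)]
    return u

d = [0.5, 0.3]    # outside options (illustrative values)
u0 = [0.2, 0.9]   # initial sample; agent 0 violates IR
u_final = guided_steps(u0, d)
```

In this toy setting the IR penalty dominates while an agent is below its outside option and pulls that agent's utility across the threshold after a small number of steps, which mirrors in spirit the finite-time IR claim. Without a feasibility constraint or a learned denoiser, the Nash term alone would keep pushing utilities upward, so the sketch says nothing about how close the result is to the true NBS.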

Problem

Research questions and friction points this paper is trying to address.

Nash Bargaining Solution

Individual Rationality

Pareto Efficiency

Utility Allocation

Negotiation Systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

differentiable guidance

graph diffusion

Nash Bargaining Solution