Sequential Resource Trading Using Comparison-Based Gradient Estimation

📅 2024-08-20

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This paper addresses Pareto-optimal resource allocation through sequential bargaining between two rational agents with unknown preferences over a finite set of resource categories. For settings where only binary accept/reject feedback is available, it proposes a gradient estimation algorithm based on comparative feedback: the algorithm leverages a greedy rationality assumption and uses rejected offers to prune the space of candidate preferences, so that every accepted proposal improves both agents' utilities, with theoretical convergence to an ε-weak Pareto optimum. The method handles both continuous and discrete negotiation domains, reducing the number of offer rounds while improving social welfare relative to the baselines. A user study shows high performance in human–AI collaboration scenarios with high resource contention caused by aligned objectives.

📝 Abstract

Autonomous agents interact with other autonomous agents and humans of unknown preferences to share resources in their environment. We explore sequential trading for resource allocation in a setting where two greedily rational agents sequentially trade resources from a finite set of categories. Each agent has a utility function that depends on the amount of resources it possesses in each category. The offering agent makes trade offers to improve its utility without knowing the responding agent's utility function, and the responding agent only accepts offers that improve its utility. To facilitate cooperation between an autonomous agent and another autonomous agent or a human, we present an algorithm for the offering agent to estimate the responding agent's gradient (preferences) and make offers based on previous acceptance or rejection responses. The algorithm's goal is to reach a Pareto-optimal resource allocation state while ensuring that the utilities of both agents improve after every accepted trade. The algorithm estimates the responding agent's gradient by leveraging the rejected offers and the greedy rationality assumption, to prune the space of potential gradients. We show that, after the algorithm makes a finite number of rejected offers, the algorithm either finds a mutually beneficial trade or certifies that the current state is epsilon-weakly Pareto optimal. We compare the proposed algorithm against various baselines in continuous and discrete trading scenarios and show that it improves the societal benefit with fewer offers. Additionally, we validate these findings in a user study with human participants, where the algorithm achieves high performance in scenarios with high resource conflict due to aligned agent goals.
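The pruning idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes locally linear utilities (so an offer is accepted exactly when its dot product with the responder's gradient is positive), and it swaps in a sampled set of unit vectors as a stand-in for the space of potential gradients. All function names and parameters (`responder_accepts`, `prune_candidates`, `propose_offer`, `find_trade`, `n_candidates`, `max_offers`) are hypothetical.

```python
import numpy as np

def responder_accepts(true_grad, offer):
    """Greedy rationality, first-order: accept only if utility increases."""
    return float(np.dot(true_grad, offer)) > 0.0

def prune_candidates(candidates, rejected_offer):
    """Drop candidate gradients that would have accepted a rejected offer."""
    return candidates[candidates @ rejected_offer <= 0.0]

def propose_offer(own_grad, candidates, tol=1e-6):
    """Return a unit transfer to the responder (the offerer receives -offer).

    The direction est - own_unit has a positive dot product with the
    estimated responder gradient and a negative one with the offerer's
    gradient, so under the estimate both agents gain.
    """
    mean = candidates.mean(axis=0)
    norm = np.linalg.norm(mean)
    if norm < tol:
        return None  # estimate is uninformative
    est = mean / norm
    direction = est - own_grad / np.linalg.norm(own_grad)
    size = np.linalg.norm(direction)
    if size < tol:
        return None  # gradients look aligned: no first-order joint gain
    return direction / size

def find_trade(own_grad, true_grad, n_candidates=2000, max_offers=50, seed=0):
    """Offer, observe accept/reject, prune; return (offer or None, rejections)."""
    rng = np.random.default_rng(seed)
    # Assumption: random unit vectors approximate the space of gradients.
    candidates = rng.normal(size=(n_candidates, own_grad.size))
    candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)
    for n_rejected in range(max_offers):
        if len(candidates) == 0:
            return None, n_rejected
        offer = propose_offer(own_grad, candidates)
        if offer is None:
            return None, n_rejected
        if responder_accepts(true_grad, offer):
            return offer, n_rejected
        candidates = prune_candidates(candidates, offer)
    return None, max_offers

if __name__ == "__main__":
    own = np.array([0.2, 1.0, 0.5])   # offerer's gradient (illustrative values)
    true = np.array([1.0, 0.2, 0.1])  # responder's gradient, hidden from the offerer
    offer, n_rejected = find_trade(own, true)
    print("accepted offer:", offer, "after", n_rejected, "rejected offers")
```

Each rejected offer removes at least one candidate, because the offer is built to have a positive dot product with the mean of the remaining candidates. The sampled set therefore shrinks on every rejection, which loosely mirrors the finite-rejection behavior claimed in the abstract, without reproducing its guarantee.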

Problem

Research questions and friction points this paper is trying to address.

Sequential resource trading between autonomous agents with unknown preferences

Estimating the responding agent's preferences (its utility gradient) from accept/reject responses to reach a Pareto-optimal allocation

Improving societal benefit with fewer offers in competitive resource scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Estimates agent preferences via gradient estimation

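Reaches a mutually beneficial trade or certifies ε-weak Pareto optimality after finitely many rejected offers, with both agents' utilities improving after every accepted trade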

Leverages rejected offers to prune gradient space

🔎 Similar Papers

Large-Scale Contextual Market Equilibrium Computation through Deep Learning