Efficient Individually Rational Recommender System under Stochastic Order

📅 2025-02-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies the exploration-exploitation trade-off in multi-agent recommender systems under an individual rationality constraint: conditioned on the knowledge available to the system, each agent's expected reward from following a recommendation must be at least that of the agent's default arm, while the system aims to maximize overall welfare. For reward distributions that satisfy a stochastic order (e.g., Bernoulli or unit-variance Gaussian), the authors derive an approximately optimal algorithm built on an auxiliary Goal Markov Decision Process (MDP), which they note is of independent interest. They also present an incentive-compatible version of the algorithm.

📝 Abstract

With the rise of online applications, recommender systems (RSs) often encounter constraints in balancing exploration and exploitation. Such constraints arise when exploration is carried out by agents whose individual utility should be balanced with overall welfare. Recent work suggests that recommendations should be individually rational. Specifically, if agents have a default arm they would use, relying on the RS should yield each agent at least the reward of the default arm, conditioned on the knowledge available to the RS. Under this individual rationality constraint, striking a balance between exploration and exploitation becomes a complex planning problem. We assume a stochastic order of the rewards (e.g., Bernoulli or unit-variance Gaussian), and derive an approximately optimal algorithm. Our technique is based on an auxiliary Goal Markov Decision Process problem that is of independent interest. Additionally, we present an incentive-compatible version of our algorithm.
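
The stochastic-order assumption in the abstract can be illustrated with a small numerical check. The sketch below assumes the usual first-order notion, where X dominates Y when F_X(t) <= F_Y(t) for every t; whether the paper uses exactly this definition is an assumption here, and all function names are illustrative rather than taken from the paper.

```python
import math

def bernoulli_cdf(p):
    """Return the CDF of a Bernoulli(p) reward."""
    def cdf(t):
        if t < 0:
            return 0.0
        if t < 1:
            return 1.0 - p
        return 1.0
    return cdf

def gaussian_cdf(mu):
    """Return the CDF of a unit-variance Gaussian reward with mean mu."""
    def cdf(t):
        return 0.5 * (1.0 + math.erf((t - mu) / math.sqrt(2.0)))
    return cdf

def dominates(cdf_x, cdf_y, grid, tol=1e-12):
    """True if X first-order stochastically dominates Y at every grid point."""
    return all(cdf_x(t) <= cdf_y(t) + tol for t in grid)

# Evaluation points from -5.00 to 5.00 in steps of 0.01 (a numerical check, not a proof).
grid = [i / 100.0 for i in range(-500, 501)]

bernoulli_ordered = dominates(bernoulli_cdf(0.7), bernoulli_cdf(0.4), grid)
gaussian_ordered = dominates(gaussian_cdf(1.0), gaussian_cdf(0.0), grid)
```

Within each of these two families, a larger parameter (success probability or mean) gives the dominating distribution, which is why any two arms from the same family are comparable.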

Problem

Research questions and friction points this paper is trying to address.

Balancing exploration and exploitation in recommender systems

Ensuring individual rationality in agent recommendations

Developing optimal algorithms under stochastic reward orders
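
The individual rationality constraint can be made concrete for a single recommendation round. The following is a minimal sketch, assuming the RS holds an expected reward for each arm conditioned on what it currently knows and recommends arms at random according to a policy; the arm names, numbers, and helper functions are hypothetical and do not reproduce the paper's algorithm.

```python
def expected_reward(policy, means):
    """Expected reward of following the recommendation.

    policy maps arm -> recommendation probability; means maps arm -> expected
    reward conditioned on the RS's current knowledge (assumed given).
    """
    return sum(prob * means[arm] for arm, prob in policy.items())

def is_individually_rational(policy, means, default_arm, tol=1e-12):
    """True if following the policy is at least as good as the default arm."""
    return expected_reward(policy, means) >= means[default_arm] - tol

def max_exploration_prob(means, default_arm, explore_arm, best_arm):
    """Largest probability q of recommending explore_arm, with 1 - q on best_arm.

    Solves q * m_explore + (1 - q) * m_best >= m_default for q.
    """
    gain = means[best_arm] - means[default_arm]
    loss = means[best_arm] - means[explore_arm]
    if loss <= 0:
        return 1.0  # exploring costs nothing relative to best_arm
    return max(0.0, min(1.0, gain / loss))

# Hypothetical conditional means held by the RS.
means = {"default": 0.5, "known_good": 0.8, "unexplored": 0.3}
q = max_exploration_prob(means, "default", "unexplored", "known_good")
policy = {"unexplored": q, "known_good": 1.0 - q}
```

In this toy instance the surplus of the known good arm over the default arm "pays for" exploring the weaker arm with probability about 0.6; any larger exploration probability would violate the constraint.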

Innovation

Methods, ideas, or system contributions that make the work stand out.

Individually rational recommender system

Stochastic order reward assumption

Goal Markov Decision Process technique
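
This summary does not specify the paper's auxiliary Goal MDP, so the sketch below only illustrates the generic idea of a goal-oriented MDP, under the assumption that the planner maximizes the probability of reaching an absorbing goal state. The toy states, actions, and the function name are hypothetical.

```python
def goal_reach_probability(transitions, goals, sweeps=1000, tol=1e-10):
    """Value iteration for the maximum probability of reaching a goal state.

    transitions[state][action] is a list of (next_state, probability) pairs.
    States with no actions are absorbing. Every next_state must be a key of
    transitions.
    """
    value = {s: (1.0 if s in goals else 0.0) for s in transitions}
    for _ in range(sweeps):
        delta = 0.0
        for state, actions in transitions.items():
            if state in goals or not actions:
                continue  # absorbing states keep their value
            best = max(
                sum(prob * value[nxt] for nxt, prob in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - value[state]))
            value[state] = best
        if delta < tol:
            break
    return value

# Toy instance: going through "mid" reaches the goal more reliably than "risky".
transitions = {
    "start": {
        "risky": [("goal", 0.6), ("dead", 0.4)],
        "careful": [("mid", 1.0)],
    },
    "mid": {"try": [("goal", 0.9), ("dead", 0.1)]},
    "goal": {},
    "dead": {},
}
values = goal_reach_probability(transitions, goals={"goal"})
```

Starting the iteration from zero on non-goal states makes it converge to the maximum reachability probability, so `values["start"]` reflects the better of the two routes.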
