Decision-Making Under Complete Uncertainty: You Will Regret Not Being Greedy

📅 2025-02-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies worst-case regret in multi-armed bandits under Knightian uncertainty. We consider a strictly uncertain environment where the decision maker observes only finitely many samples and faces an adversarial nature. To formalize this, we construct a probabilistic game model and employ minimax analysis combined with asymptotic theory. We prove that the greedy policy, which selects the arm with the highest empirical mean, is optimal with respect to worst-case regret, and that its regret converges to zero at rate $O(1/\sqrt{n})$ as the sample size $n$ grows. This constitutes the first theoretical demonstration of the greedy policy’s optimality within a purely non-probabilistic uncertainty framework. Empirical evaluation on Google restaurant review data shows that the greedy policy outperforms uniform sampling and Thompson sampling, in line with the theoretical results.

📝 Abstract
In this paper, we propose a probabilistic game-theoretic model to study the properties of the worst-case regret of the greedy strategy under complete (Knightian) uncertainty. In a game between a decision-maker (DM) and an adversarial agent (Nature), the DM observes a realization of product ratings for each product. Upon observation, the DM chooses a strategy, which is a function from the set of observations to the set of products. We study the theoretical properties, including the worst-case regret of the greedy strategy that chooses the product with the highest observed average rating. We prove that, with respect to the worst-case regret, the greedy strategy is optimal and that, in the limit, the regret of the greedy strategy converges to zero. We validate the model on data collected from Google reviews for restaurants, showing that the greedy strategy not only performs according to the theoretical findings but also outperforms the uniform strategy and the Thompson Sampling algorithm.
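The greedy strategy described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration and is not taken from the paper: the function names, the use of 0/1 (Bernoulli) ratings, the tie-breaking rule, and the definition of regret as the gap between the best true mean and the chosen product's true mean.

```python
import random
from statistics import mean


def greedy_choice(observations):
    """Return the index of the product with the highest observed average rating.

    `observations` holds one non-empty list of ratings per product.
    Ties go to the lowest index (an assumption; the source does not say).
    """
    averages = [mean(ratings) for ratings in observations]
    return max(range(len(averages)), key=lambda i: averages[i])


def regret(true_means, chosen):
    """Assumed regret: best true mean minus the true mean of the chosen product."""
    return max(true_means) - true_means[chosen]


def simulate_expected_regret(true_means, n_samples, n_trials=2000, seed=0):
    """Estimate the greedy strategy's expected regret by Monte Carlo simulation.

    Each product's ratings are drawn as 0/1 values with the given true mean
    (a hypothetical rating model chosen only to keep the sketch short).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        observations = [
            [1 if rng.random() < p else 0 for _ in range(n_samples)]
            for p in true_means
        ]
        total += regret(true_means, greedy_choice(observations))
    return total / n_trials


if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, simulate_expected_regret([0.3, 0.7], n))
```

Under these assumptions, the estimated regret shrinks as the number of observed ratings per product grows, which is consistent with the abstract's claim that the greedy strategy's regret converges to zero in the limit. The sketch fixes one set of true means, so it does not compute the worst case over Nature's choices.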
Problem

Research questions and friction points this paper is trying to address.

Analyzes worst-case regret of greedy strategy
Focuses on decision-making under complete uncertainty
Validates model using Google restaurant reviews data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic game-theoretic model
Worst-case regret analysis
Greedy strategy optimization
Kristijan Atanasov
Department of Informatics, King’s College London
Mehmet Ismail
Department of Political Economy, King’s College London
Frederik Mallmann-Trenn
King's College London
Randomized processes, data science, distributed computing, community detection, computing with noise