Learning in Prophet Inequalities with Noisy Observations

📅 2026-04-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies prophet inequalities in online decision-making when rewards are observable only through noisy realizations and the true rewards follow a linear model with an unknown parameter. The authors propose threshold policies based on lower confidence bounds (LCB), with Explore-then-Decide and ε-Greedy variants that integrate learning and decision-making. In the i.i.d. setting, both variants achieve the sharp competitive ratio of $1 - 1/e$ under a mild condition on the optimal value. For non-identically distributed arrivals, the approach guarantees a competitive ratio of $1/2$ against a relaxed benchmark; with limited window access to past rewards, it attains the tight ratio of $1/2$ against the optimal benchmark. Together, these results extend classical prophet-inequality guarantees to settings with observation noise and unknown reward distributions.

📝 Abstract

We study the prophet inequality, a fundamental problem in online decision-making and optimal stopping, in a practical setting where rewards are observed only through noisy realizations and reward distributions are unknown. At each stage, the decision-maker receives a noisy reward whose true value follows a linear model with an unknown latent parameter, and observes a feature vector drawn from a distribution. To address this challenge, we propose algorithms that integrate learning and decision-making via lower-confidence-bound (LCB) thresholding. In the i.i.d. setting, we establish that both an Explore-then-Decide strategy and an $\varepsilon$-Greedy variant achieve the sharp competitive ratio of $1 - 1/e$, under a mild condition on the optimal value. For non-identical distributions, we show that a competitive ratio of $1/2$ can be guaranteed against a relaxed benchmark. Moreover, with limited window access to past rewards, the tight ratio of $1/2$ against the optimal benchmark is achieved.
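To make the idea of LCB thresholding concrete, the following is a minimal Python sketch of an Explore-then-Decide style rule. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not specify the estimator, the confidence radius, or the threshold, so the ridge-regression estimate, the width multiplier `beta`, the regularizer `lam`, the exploration length `n_explore`, the fixed `threshold`, and all function names (`solve`, `ridge_fit`, `lcb`, `explore_then_decide`) are hypothetical choices made here for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def ridge_fit(xs, ys, lam=1.0):
    """Ridge estimate: returns (theta_hat, V) with V = lam*I + sum_s x_s x_s^T."""
    d = len(xs[0])
    V = [[lam * (i == j) for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for x, y in zip(xs, ys):
        for i in range(d):
            b[i] += x[i] * y
            for j in range(d):
                V[i][j] += x[i] * x[j]
    return solve(V, b), V


def lcb(x, theta_hat, V, beta):
    """Assumed LCB form: x^T theta_hat - beta * sqrt(x^T V^{-1} x)."""
    width = math.sqrt(max(0.0, dot(x, solve(V, x))))
    return dot(x, theta_hat) - beta * width


def explore_then_decide(xs, ys, n_explore, threshold, beta=1.0, lam=1.0):
    """Use the first n_explore stages only for learning, then stop at the
    first later stage whose LCB reaches the threshold. Returns the stopping
    index, or None if no stage qualifies."""
    theta_hat, V = ridge_fit(xs[:n_explore], ys[:n_explore], lam)
    for t in range(n_explore, len(xs)):
        if lcb(xs[t], theta_hat, V, beta) >= threshold:
            return t
    return None


if __name__ == "__main__":
    # Synthetic instance (assumed): linear rewards plus Gaussian noise.
    random.seed(0)
    theta = [1.0, 2.0]
    xs = [[random.random(), random.random()] for _ in range(200)]
    ys = [dot(x, theta) + random.gauss(0.0, 0.1) for x in xs]
    print(explore_then_decide(xs, ys, n_explore=50, threshold=2.0))
```

Thresholding on the lower confidence bound rather than on the point estimate is conservative: a stage is accepted only when its reward is estimated to be high even after discounting estimation uncertainty, so noisy or poorly explored feature directions are less likely to trigger a premature stop. An ε-Greedy style variant would interleave exploration with decisions instead of using a fixed initial phase.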

Problem

Research questions and friction points this paper is trying to address.

prophet inequalities

noisy observations

online decision-making

optimal stopping

unknown reward distributions

Innovation

Methods, ideas, or system contributions that make the work stand out.

prophet inequality

noisy observations

lower-confidence-bound (LCB)