🤖 AI Summary
This paper introduces the Keychain Problem, a family of sequential decision-making problems in which only a subset of actions (a keychain) is available at each stage. Given a Bayesian prior on the effectiveness of keys, the agent (a locksmith) selects one key from each keychain to maximize the expected number of rounds in which the lock is opened or, equivalently, to minimize the opportunity cost: the expected number of rounds in which the keychain contains a correct key but the selected key is incorrect. We analyze three assumptions on the order in which keychains are tested: (i) a fixed, known order; (ii) a random order sampled from a known distribution on a set of scenarios; and (iii) an order selected by the locksmith. We give an exact algorithm for the simplest of these settings, and approximation algorithms and hardness results for the others. In the Probabilistic Scenarios setting, the approximation algorithm is based on a novel connection between combinatorial auctions and policy design for sequential decision-making problems. To illustrate the generality of this technique, the same ideas are applied to obtain Philosopher Inequalities for Online Bipartite Matching and some of its extensions.
📝 Abstract
In this paper, we introduce a family of sequential decision-making problems, collectively called the Keychain Problem, that involve exploring a set of actions to maximize expected payoff when only a subset of actions is available in each stage. In an instance of the Keychain Problem, a locksmith faces a sequence of choices, each of which involves selecting one key from a specified subset (a keychain) to attempt to open a lock. Given a Bayesian prior on the effectiveness of keys, the locksmith's goal is to maximize the expected number of rounds in which the lock is opened or, equivalently, to minimize the opportunity cost, which is the expected number of rounds in which the keychain has a correct key but the selected key is incorrect. We investigate Keychain Problems under three assumptions on the order in which keychains are tested by the locksmith: a fixed, known order; a random order sampled from a known distribution on a set of "scenarios"; or an order selected by the locksmith themselves. We present an exact algorithm for the simplest of these settings, and we present approximation algorithms and hardness results for the others. In the Probabilistic Scenarios setting, our approximation algorithm is based on a novel connection between combinatorial auctions and policy design for sequential decision-making problems. To illustrate the generality of this technique, we apply the same ideas to obtain Philosopher Inequalities for Online Bipartite Matching and some of its extensions.
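The equivalence between the two objectives can be checked on a toy instance. The sketch below is illustrative only and is not the paper's algorithm: it assumes the prior is given as a list of (set of working keys, probability) pairs, and it evaluates a simple non-adaptive choice of keys in a fixed keychain order. The names `evaluate`, `prior`, `keychains`, and `picks`, and the instance itself, are hypothetical. In every round the lock is either opened, missed although the keychain held a correct key, or impossible to open, so the expected number of opens plus the opportunity cost equals the expected number of feasible rounds, which does not depend on the keys chosen.

```python
from fractions import Fraction


def evaluate(prior, keychains, picks):
    """Return (expected opens, opportunity cost, expected feasible rounds).

    Assumed representation (not from the paper): `prior` is a list of
    (set of working keys, probability) pairs; `picks[t]` is the key tried
    on `keychains[t]`, chosen in advance without using feedback.
    """
    opens = missed = feasible = Fraction(0)
    for good_keys, prob in prior:
        for chain, pick in zip(keychains, picks):
            if pick not in chain:
                raise ValueError("pick must come from the current keychain")
            has_correct = bool(good_keys & chain)  # success was possible
            success = pick in good_keys            # the lock was opened
            feasible += prob * has_correct
            opens += prob * success
            missed += prob * (has_correct and not success)
    return opens, missed, feasible


# Hypothetical toy instance: exactly one of three keys works.
prior = [
    (frozenset({"a"}), Fraction(1, 2)),
    (frozenset({"b"}), Fraction(1, 3)),
    (frozenset({"c"}), Fraction(1, 6)),
]
keychains = [frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"a", "c"})]
picks = ["a", "b", "a"]

opens, missed, feasible = evaluate(prior, keychains, picks)
print(opens, missed, feasible)  # 4/3 2/3 2
```

Because the feasible-round total is fixed by the prior and the keychains, any choice of keys that raises the expected number of opens lowers the opportunity cost by the same amount. An adaptive locksmith would also update the prior after each attempt, which this sketch deliberately omits.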