Learning to Select and Rank from Choice-Based Feedback: A Simple Nested Approach

📅 2023-07-13
📈 Citations: 1
Influential: 0
🤖 AI Summary
This paper studies preference learning from choice feedback over dynamic assortments, aiming to identify either the most preferred item or the complete preference ranking with as few samples as possible at a high confidence level. The authors propose two algorithms, Nested Elimination (NE) and Nested Partition (NP), that provide the first non-asymptotic, instance-dependent sample complexity bounds for arbitrary strict preference structures; NE is worst-case asymptotically optimal in the information-theoretic sense, while NP is optimal up to a constant factor. The analysis combines multi-dimensional random walk modeling, a divide-and-conquer strategy, and an information-theoretic framework. The theoretical results are complemented by experiments on both synthetic and real-world datasets demonstrating efficiency and robustness. The core contribution is an algorithmic framework for dynamic preference learning that is simultaneously practical and theoretically optimal.
📝 Abstract
We study a ranking and selection problem of learning from choice-based feedback with dynamic assortments. In this problem, a company sequentially displays a set of items to a population of customers and collects their choices as feedback. The only information available about the underlying choice model is that the choice probabilities are consistent with some unknown true strict ranking over the items. The objective is to identify, with the fewest samples, the most preferred item or the full ranking over the items at a high confidence level. We present novel and simple algorithms for both learning goals. In the first subproblem regarding best-item identification, we introduce an elimination-based algorithm, Nested Elimination (NE). In the more complex subproblem regarding full-ranking identification, we generalize NE and propose a divide-and-conquer algorithm, Nested Partition (NP). We provide strong characterizations of both algorithms through instance-specific and non-asymptotic bounds on the sample complexity. This is accomplished using an analytical framework that characterizes the system dynamics through analyzing a sequence of multi-dimensional random walks. We also establish a connection between our nested approach and the information-theoretic lower bounds. We thus show that NE is worst-case asymptotically optimal, and NP is optimal up to a constant factor. Finally, numerical experiments from both synthetic and real data corroborate our theoretical findings.
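The abstract describes an elimination-based algorithm (NE) for best-item identification from choice feedback. The following is a toy sketch of a generic successive-elimination loop in that spirit, not the paper's actual NE rule: it repeatedly shows the surviving assortment, tallies customer choices, and drops the least-chosen item once its empirical choice frequency is significantly below the runner-up's by a Hoeffding-style margin. All names, the batch size, and the confidence radius are illustrative assumptions.

```python
import math
import random


def eliminate_best(choice_oracle, items, delta=0.1, batch=500):
    """Toy elimination loop for best-item identification from choice feedback.

    This sketches the *spirit* of an elimination algorithm; the paper's
    Nested Elimination (NE) uses a different, nested stopping rule with
    instance-dependent guarantees.
    """
    survivors = list(items)
    while len(survivors) > 1:
        counts = {i: 0 for i in survivors}
        n = 0
        # Keep sampling the current assortment until one item is
        # significantly the least chosen, then eliminate it.
        while True:
            for _ in range(batch):
                counts[choice_oracle(survivors)] += 1
                n += 1
            # Hoeffding-style confidence radius (illustrative, not tight)
            radius = math.sqrt(math.log(4 * n / delta) / (2 * n))
            ranked = sorted(survivors, key=lambda i: counts[i])
            worst, runner_up = ranked[0], ranked[1]
            if counts[runner_up] / n - counts[worst] / n > 2 * radius:
                survivors.remove(worst)
                break
    return survivors[0]


# Usage: a hypothetical multinomial choice oracle consistent with the
# strict ranking 2 > 1 > 0 (choice probabilities proportional to weights).
random.seed(0)
weights = {0: 1.0, 1: 2.0, 2: 4.0}

def oracle(assortment):
    return random.choices(assortment, weights=[weights[i] for i in assortment])[0]

best = eliminate_best(oracle, [0, 1, 2], delta=0.1)
```

Note the key structural point the paper exploits: because choice probabilities are only assumed consistent with some unknown strict ranking, frequency comparisons within a displayed assortment are informative even without a parametric choice model.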
Problem

Research questions and friction points this paper is trying to address.

Customer Feedback
Product Popularity
Efficient Determination
Innovation

Methods, ideas, or system contributions that make the work stand out.

NE Algorithm
NP Algorithm
Preference Learning
Junwen Yang
Institute of Operations Research and Analytics, National University of Singapore
Yifan Feng
Assistant Professor, NUS Business School
Tags: learning, information, preference, platform and market