🤖 AI Summary
This paper addresses offline data-driven assortment optimization, where a firm lacks prior knowledge of the underlying choice model and historical transaction data offers insufficient coverage due to the combinatorial number of possible assortments. The authors propose PASTA (Pessimistic Assortment Optimization), a framework that brings the pessimism principle to this setting for the first time. Under the minimal assumption that the offline data distribution covers an optimal assortment, PASTA yields provably effective algorithms for broad classes of choice models, including the multinomial logit and nested logit models. Theoretically, it establishes the first finite-sample regret bounds for this problem and matches a minimax lower bound, making it optimal in both sample and model complexity. Empirically, it significantly outperforms existing baselines across diverse benchmarks.
📝 Abstract
We study a broad class of assortment optimization problems in an offline, data-driven setting. In such problems, a firm lacks prior knowledge of the underlying choice model and aims to determine an optimal assortment based on historical customer choice data. The combinatorial nature of assortment optimization often results in insufficient data coverage, posing a significant challenge to designing provably effective solutions. To address this, we introduce a novel Pessimistic Assortment Optimization (PASTA) framework that leverages the principle of pessimism to achieve optimal expected revenue under general choice models. Notably, PASTA requires only that the offline data distribution contains an optimal assortment, rather than full coverage of all feasible assortments. Theoretically, we establish the first finite-sample regret bounds for offline assortment optimization across several widely used choice models, including the multinomial logit and nested logit models. We further derive a minimax regret lower bound, showing that PASTA is minimax optimal in terms of sample and model complexity. Numerical experiments demonstrate that our method outperforms existing baseline approaches.
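To make the pessimism principle concrete, here is a minimal, hypothetical sketch (not the paper's PASTA algorithm): for each assortment observed in the offline data, compute the empirical mean revenue and subtract a confidence penalty that shrinks as that assortment's sample count grows, then select the assortment with the highest lower confidence bound. The data format, the Hoeffding-style penalty (which assumes per-transaction revenues normalized to [0, 1]), and the function name are illustrative assumptions.

```python
import math
from collections import defaultdict

def pessimistic_assortment(transactions, prices, delta=0.05):
    """Pick the observed assortment maximizing a lower confidence
    bound (LCB) on expected revenue.

    transactions: list of (assortment, chosen_item) pairs, where
      assortment is a frozenset of product ids and chosen_item is a
      product id or None (no purchase).
    prices: dict mapping product id -> price, assumed in [0, 1].
    delta: confidence level for the union bound over assortments.
    """
    revenue_sum = defaultdict(float)
    counts = defaultdict(int)
    for assortment, choice in transactions:
        counts[assortment] += 1
        revenue_sum[assortment] += prices.get(choice, 0.0)

    best, best_lcb = None, float("-inf")
    for s, n in counts.items():
        mean = revenue_sum[s] / n
        # Hoeffding-style penalty; shrinks as 1 / sqrt(n), so
        # poorly covered assortments are penalized heavily.
        penalty = math.sqrt(math.log(2 * len(counts) / delta) / (2 * n))
        if mean - penalty > best_lcb:
            best, best_lcb = s, mean - penalty
    return best, best_lcb
```

In this toy setting, a rarely observed assortment with a lucky high empirical revenue is rejected in favor of a well-covered one, which is exactly the behavior pessimism is meant to enforce when coverage is insufficient:

```python
prices = {1: 0.8, 2: 1.0}
A, B = frozenset({1}), frozenset({2})
# A: 100 observations, always purchased; B: 2 lucky purchases.
data = [(A, 1)] * 100 + [(B, 2)] * 2
best, lcb = pessimistic_assortment(data, prices)
# Naive empirical means would pick B (1.0 > 0.8); the LCB picks A.
```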