🤖 AI Summary
Problem: Conventional gradient-based maximum likelihood estimation (MLE) for Multinomial Logit (MNL), Nested Logit (NL), and Tree-Nested Logit (TNL) models suffers from slow convergence, sensitivity to initial values, and numerical instability.
Method: This paper proposes a unified convex conic optimization framework that exactly reformulates the MLE problems of all three models as tractable exponential cone programs (ECPs). For the nested models, a bi-level optimization strategy is introduced to improve convergence robustness, leveraging interior-point methods and off-the-shelf conic solvers (e.g., MOSEK) without manual hyperparameter tuning.
Contribution/Results: Experiments demonstrate that the method significantly outperforms traditional gradient-based approaches, especially in large-scale, high-dimensional settings, yielding higher-quality estimates, greater robustness to initialization, and several-fold speedups. To our knowledge, this is the first work to establish a unified conic optimization paradigm for parameter estimation in both multinomial and nested logit models.
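As an illustration of the kind of reformulation involved (the notation below is ours, not necessarily the paper's), the MNL log-likelihood can be cast as an exponential cone program by introducing an epigraph variable for each log-sum-exp term:

```latex
\max_{\beta}\ \ell(\beta)
  =\sum_{n=1}^{N}\Big(\beta^{\top}x_{n,i_n}
    -\log\sum_{j=1}^{J}\exp\big(\beta^{\top}x_{nj}\big)\Big)
\quad\Longleftrightarrow\quad
\begin{aligned}
\min_{\beta,\,t,\,v}\ \ &\sum_{n=1}^{N}\big(t_n-\beta^{\top}x_{n,i_n}\big)\\
\text{s.t.}\ \ &\textstyle\sum_{j=1}^{J} v_{nj}\le 1,
  && n=1,\dots,N,\\
&\big(v_{nj},\,1,\,\beta^{\top}x_{nj}-t_n\big)\in\mathcal{K}_{\exp},
  && n=1,\dots,N,\ j=1,\dots,J,
\end{aligned}
```

where $\mathcal{K}_{\exp}=\operatorname{cl}\{(x_1,x_2,x_3): x_1\ge x_2\, e^{x_3/x_2},\ x_2>0\}$. Each cone membership enforces $v_{nj}\ge \exp(\beta^{\top}x_{nj}-t_n)$, so the sum constraint gives $t_n\ge\log\sum_j \exp(\beta^{\top}x_{nj})$, exactly capturing the log-sum-exp in conic form.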
📝 Abstract
In this paper, we revisit parameter estimation for multinomial logit (MNL), nested logit (NL), and tree-nested logit (TNL) models through the framework of convex conic optimization. Traditional approaches typically solve the maximum likelihood estimation (MLE) problem using gradient-based methods, which are sensitive to step-size selection and initialization, and may therefore suffer from slow or unstable convergence. In contrast, we propose a novel estimation strategy that reformulates these models as conic optimization problems, enabling more robust and reliable estimation procedures. Specifically, we show that the MLE for MNL admits an equivalent exponential cone program (ECP). For NL and TNL, we prove that when the dissimilarity (scale) parameters are fixed, the estimation problem is convex and likewise reducible to an ECP. Leveraging these results, we design a two-stage procedure: an outer loop that updates the scale parameters and an inner loop that solves the ECP to update the utility coefficients. The inner problems are handled by interior-point methods with iteration counts that grow only logarithmically in the target accuracy, as implemented in off-the-shelf solvers (e.g., MOSEK). Extensive experiments across estimation instances of varying size show that our conic approach attains better MLE solutions, greater robustness to initialization, and substantial speedups compared to standard gradient-based MLE, particularly on large-scale instances with high-dimensional specifications and large choice sets. Our findings establish exponential cone programming as a practical and scalable alternative for estimating a broad class of discrete choice models.
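The two-stage procedure described above (an outer loop over the dissimilarity parameters, an inner convex solve for the utility coefficients) can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's implementation: the nest structure, variable names, and the use of SciPy's BFGS as a stand-in for a conic solver such as MOSEK are all our own assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

rng = np.random.default_rng(0)

# Synthetic two-level nested-logit data: 4 alternatives in 2 nests with a
# shared dissimilarity parameter lam (all names here are illustrative).
N, J, d = 500, 4, 3
nests = [np.array([0, 1]), np.array([2, 3])]
X = rng.normal(size=(N, J, d))
beta_true, lam_true = np.array([1.0, -0.5, 0.8]), 0.6

def log_probs(beta, lam):
    """Log choice probabilities of a two-level NL model, computed in log space."""
    V = X @ beta                                   # (N, J) systematic utilities
    # Inclusive values and nest-level log-probabilities.
    I = np.stack([lam * logsumexp(V[:, idx] / lam, axis=1) for idx in nests], axis=1)
    log_pnest = I - logsumexp(I, axis=1, keepdims=True)
    logP = np.empty_like(V)
    for m, idx in enumerate(nests):
        scaled = V[:, idx] / lam                   # within-nest conditional logit
        logP[:, idx] = log_pnest[:, [m]] + scaled - logsumexp(scaled, axis=1, keepdims=True)
    return logP

# Sample observed choices from the true model.
y = np.array([rng.choice(J, p=np.exp(lp)) for lp in log_probs(beta_true, lam_true)])

def neg_loglik(beta, lam):
    return -log_probs(beta, lam)[np.arange(N), y].sum()

# Outer loop: coarse search over the dissimilarity parameter lam.
# Inner loop: with lam fixed the problem is convex in beta (per the paper),
# so any reliable solver applies; BFGS here stands in for a conic solver.
best_nll, beta_hat, lam_hat = np.inf, None, None
for lam in np.linspace(0.3, 1.0, 8):
    res = minimize(lambda b: neg_loglik(b, lam), np.zeros(d), method="BFGS")
    if res.fun < best_nll:
        best_nll, beta_hat, lam_hat = res.fun, res.x, lam

nll = best_nll
print(f"lam_hat={lam_hat:.2f}, nll={nll:.1f}")
```

In the paper's actual pipeline the inner problem would be handed to an interior-point conic solver as an ECP, whose iteration count grows only logarithmically in the target accuracy; the grid search over `lam` above is just one simple choice of outer update.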