Estimating Mixture Distributions via Stochastic Mirror Descent

📅 2026-05-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three limitations of traditional distribution estimation methods (computational inefficiency, reliance on prior knowledge of the support set, and limited model expressiveness) by formulating mixture distribution estimation as a stochastic convex optimization problem. The authors propose a unified framework based on Stochastic Mirror Descent (SMD) that minimizes cross-entropy loss over the space of mixtures of M base distributions. Different choices of Bregman divergence yield different estimators, and the approach removes the need for precise prior knowledge of the discrete support set while scaling efficiently to large collections of base distributions. Theoretical analysis shows that, under mild conditions, the estimators achieve near-optimal convergence rates in both KL divergence and ℓ₂ norm. Numerical experiments indicate better sample efficiency and scalability than classical estimators, particularly when computation is expensive.
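
As a reading aid, the optimization problem described above can be written out in standard mirror descent notation. This is a sketch and not necessarily the paper's exact formulation: the weight vector $ w $, the simplex $ \Delta_M $, the true distribution $ p $, the samples $ X_t $, and the step sizes $ \eta_t $ are notation assumed here, while $ f_i $ and $ \varphi $ follow the abstract.

$$
\min_{w \in \Delta_M} \; L(w) = \mathbb{E}_{X \sim p}\left[ -\log \sum_{i=1}^{M} w_i f_i(X) \right],
\qquad
\Delta_M = \Big\{ w \in \mathbb{R}^M : w_i \ge 0, \; \sum_{i=1}^{M} w_i = 1 \Big\}
$$

$$
g_{t,i} = -\frac{f_i(X_t)}{\sum_{j=1}^{M} w_{t,j} f_j(X_t)},
\qquad
w_{t+1} = \arg\min_{w \in \Delta_M} \Big\{ \eta_t \langle g_t, w \rangle + D_\varphi(w, w_t) \Big\}
$$

$$
D_\varphi(w, v) = \varphi(w) - \varphi(v) - \langle \nabla \varphi(v), \, w - v \rangle
$$

For example, choosing the negative entropy $ \varphi(w) = \sum_i w_i \log w_i $ makes $ D_\varphi $ the KL divergence on the simplex, and the update reduces to the familiar multiplicative form

$$
w_{t+1,i} = \frac{w_{t,i} \exp(-\eta_t g_{t,i})}{\sum_{k=1}^{M} w_{t,k} \exp(-\eta_t g_{t,k})}.
$$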

📝 Abstract

We revisit the classical problem of estimating an unknown distribution from its samples by fitting a mixture model that minimizes cross-entropy loss. Framing the task as a stochastic convex optimization problem over the space of $ M $-component mixture distributions, we propose a family of estimators derived from the stochastic mirror descent (SMD) algorithm. This optimization-based approach provides a principled and flexible framework that generalizes traditional estimators and yields a variety of novel estimators through the choice of Bregman divergence. A key advantage of our method is that it scales efficiently with the number of candidate components $ f_i $; that is, one can employ a large set of basis distributions in the mixture model without incurring significant computational overhead. This enables richer approximations and improved estimation accuracy. Moreover, for categorical distributions (discrete outcomes), our estimators do not require a strict lower bound; in other words, our framework does not require precise knowledge of the support of the distribution. We demonstrate that, under mild conditions, the proposed $ \varphi $-SMD estimators achieve near-optimal convergence rates in both Kullback-Leibler (KL) divergence and $ \ell_2 $-norm and offer practical benefits when computation is expensive. Our numerical analysis highlights improved performance guarantees over classical estimators, particularly in terms of sample efficiency and scalability.
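
To make the idea concrete, here is a minimal sketch of one member of such a family: stochastic mirror descent with the negative-entropy mirror map (the exponentiated-gradient update) applied to the weights of a fixed set of component densities. It is an illustration under stated assumptions, not the authors' estimator: the function names `gaussian_pdf` and `smd_mixture_weights`, the $ 1/\sqrt{t} $ step-size schedule, the iterate averaging, and the two-Gaussian toy data are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return np.exp(-0.5 * z * z) / (std * np.sqrt(2.0 * np.pi))


def smd_mixture_weights(samples, components, step_size=0.1):
    """Single pass of entropic stochastic mirror descent on the simplex.

    samples:    1-D array of observations, processed one at a time
    components: list of callables f_i(x) returning a density value
    step_size:  base step; the schedule step_size / sqrt(t) is an assumption
    Returns the running average of the weight iterates.
    """
    m = len(components)
    w = np.full(m, 1.0 / m)   # start from the uniform mixture
    w_avg = np.zeros(m)
    for t, x in enumerate(samples, start=1):
        f = np.array([f_i(x) for f_i in components])
        mix = max(float(w @ f), 1e-300)   # guard against division by zero
        grad = -f / mix                   # gradient of -log(sum_i w_i f_i(x))
        eta = step_size / np.sqrt(t)
        # Multiplicative update, computed in the log domain for stability.
        log_w = np.log(np.maximum(w, 1e-300)) - eta * grad
        log_w -= log_w.max()
        w = np.exp(log_w)
        w /= w.sum()                      # renormalize onto the simplex
        w_avg += (w - w_avg) / t          # running average of iterates
    return w_avg


# Toy data: a 0.7 / 0.3 mixture of N(-2, 1) and N(2, 1).
rng = np.random.default_rng(0)
n = 5000
from_first = rng.random(n) < 0.7
samples = np.where(from_first, rng.normal(-2.0, 1.0, n), rng.normal(2.0, 1.0, n))

components = [
    lambda x: gaussian_pdf(x, -2.0, 1.0),
    lambda x: gaussian_pdf(x, 2.0, 1.0),
]
w_hat = smd_mixture_weights(samples, components)
print("estimated weights:", w_hat)
```

Each step evaluates the $ M $ component densities once at the current sample, so the per-sample cost in this sketch grows linearly with the number of components. Swapping the negative entropy for another mirror map would change only the update lines inside the loop.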

Problem

Research questions and friction points this paper is trying to address.

mixture distribution

distribution estimation

cross-entropy loss

stochastic optimization

KL divergence

Innovation

Methods, ideas, or system contributions that make the work stand out.

Stochastic Mirror Descent

Mixture Distribution Estimation

Bregman Divergence