🤖 AI Summary
This paper addresses the estimation of location $\mu$ and scale $\sigma$ parameters for density families of the form $f(x) = \sigma^{-l} f_0((x - \mu)/\sigma)$, where $f_0$ is known and $n$ i.i.d. samples are given. We establish, for the first time, that maximum likelihood estimation (MLE) in this model is NP-hard, rendering standard likelihood-based optimization computationally intractable. To overcome this barrier, we propose a non-convex optimization framework based on minimizing the Wasserstein distance between the empirical distribution and the parametric model. We prove that, for any $f_0$, this method achieves $\varepsilon$-accurate estimation of $(\mu, \sigma)$ in $\mathrm{poly}(1/\varepsilon)$ time. Our approach circumvents the fundamental computational limitations of the likelihood paradigm while preserving statistical robustness. It yields the first polynomial-time approximation algorithm for parameter estimation in general location-scale families, unifying computational efficiency with statistical consistency.
📝 Abstract
Parameter estimation is a fundamental challenge in machine learning, crucial for tasks such as neural network weight fitting and Bayesian inference. This paper focuses on the complexity of estimating translation $\boldsymbol{\mu} \in \mathbb{R}^l$ and shrinkage $\sigma \in \mathbb{R}_{++}$ parameters for a distribution of the form $\frac{1}{\sigma^l} f_0\left(\frac{\boldsymbol{x} - \boldsymbol{\mu}}{\sigma}\right)$, where $f_0$ is a known density in $\mathbb{R}^l$, given $n$ samples. We highlight that while the problem is NP-hard for Maximum Likelihood Estimation (MLE), it is possible to obtain $\varepsilon$-approximations for arbitrary $\varepsilon > 0$ within $\mathrm{poly}\left(\frac{1}{\varepsilon}\right)$ time using the Wasserstein distance.
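The paper's actual algorithm is not reproduced here. As a rough illustration of the Wasserstein-matching idea, the sketch below fits $(\mu, \sigma)$ in one dimension by minimizing the empirical 1-Wasserstein distance, which for sorted data reduces to matching order statistics against the base density's quantiles. Everything specific in it is an assumption for illustration: the standard normal as $f_0$, the function name `w1_fit`, and the simple grid search over $\sigma$ (the paper treats general $f_0$ in $\mathbb{R}^l$ and a non-convex formulation).

```python
import numpy as np
from statistics import NormalDist  # standard-normal f_0 is an illustrative choice


def w1_fit(samples, base_quantile, sigma_lo=0.1, sigma_hi=5.0, grid=200):
    """Estimate (mu, sigma) by minimizing the empirical 1-Wasserstein distance.

    In 1-D, W1(F_n, F_{mu,sigma}) = (1/n) * sum_i |x_(i) - (mu + sigma * q_i)|,
    where x_(i) are the sorted samples and q_i are quantiles of f_0.  For a
    fixed sigma the L1-optimal mu is the median of the residuals, so a crude
    grid search over sigma suffices for this sketch.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    t = (np.arange(n) + 0.5) / n                 # midpoint quantile levels
    q = np.array([base_quantile(ti) for ti in t])  # base-density quantiles

    best = None
    for sigma in np.linspace(sigma_lo, sigma_hi, grid):
        r = x - sigma * q
        mu = np.median(r)                        # L1-optimal location given sigma
        loss = np.mean(np.abs(r - mu))           # empirical W1 at (mu, sigma)
        if best is None or loss < best[0]:
            best = (loss, mu, sigma)
    return best[1], best[2]


# Usage: recover (mu, sigma) = (2.0, 1.5) from synthetic Gaussian data.
rng = np.random.default_rng(0)
samples = 2.0 + 1.5 * rng.standard_normal(2000)
mu_hat, sigma_hat = w1_fit(samples, NormalDist().inv_cdf)
```

Note that in this 1-D special case the objective is in fact convex in $(\mu, \sigma)$ (an L1 regression of order statistics on base quantiles); the non-convexity the paper contends with arises in the general multivariate setting.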