🤖 AI Summary
This paper addresses the estimation of location $\mu$ and scale $\sigma$ parameters for density families of the form $f(x) = \sigma^{-l} f_0((x - \mu)/\sigma)$, where $f_0$ is known and $n$ i.i.d. samples are given. We establish, for the first time, that maximum likelihood estimation (MLE) in this model is NP-hard, rendering standard likelihood-based optimization computationally intractable. To overcome this barrier, we propose a non-convex optimization framework based on minimizing the Wasserstein distance between the empirical distribution and the parametric model. We prove that, for any $f_0$, this method achieves $\varepsilon$-accurate estimation of $(\mu, \sigma)$ in $\mathrm{poly}(1/\varepsilon)$ time. Our approach circumvents the fundamental computational limitations of the likelihood paradigm while preserving statistical robustness. It yields the first polynomial-time approximation algorithm for parameter estimation in general location-scale families, unifying computational efficiency with statistical consistency.
📝 Abstract
Parameter estimation is a fundamental challenge in machine learning, crucial for tasks such as neural network weight fitting and Bayesian inference. This paper focuses on the complexity of estimating translation $\boldsymbol{\mu} \in \mathbb{R}^l$ and shrinkage $\sigma \in \mathbb{R}_{++}$ parameters for a distribution of the form $\frac{1}{\sigma^l} f_0\left(\frac{\boldsymbol{x} - \boldsymbol{\mu}}{\sigma}\right)$, where $f_0$ is a known density in $\mathbb{R}^l$, given $n$ samples. We highlight that while the problem is NP-hard for Maximum Likelihood Estimation (MLE), it is possible to obtain $\varepsilon$-approximations for arbitrary $\varepsilon > 0$ within $\mathrm{poly}\left(\frac{1}{\varepsilon}\right)$ time using the Wasserstein distance.
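The paper's actual algorithm is not reproduced here. As a rough illustration of the Wasserstein-matching idea, the sketch below fits $(\mu, \sigma)$ in one dimension by minimizing the empirical 1-Wasserstein distance, which for sorted data reduces to matching order statistics against the base density's quantiles. Everything specific in it is an assumption for illustration: the standard normal as $f_0$, the function name `w1_fit`, and the simple grid search over $\sigma$ (the paper treats general $f_0$ in $\mathbb{R}^l$ and a non-convex formulation).

```python
import numpy as np
from statistics import NormalDist  # standard-normal f_0 is an illustrative choice


def w1_fit(samples, base_quantile, sigma_lo=0.1, sigma_hi=5.0, grid=200):
    """Estimate (mu, sigma) by minimizing the empirical 1-Wasserstein distance.

    In 1-D, W1(F_n, F_{mu,sigma}) = (1/n) * sum_i |x_(i) - (mu + sigma * q_i)|,
    where x_(i) are the sorted samples and q_i are quantiles of f_0.  For a
    fixed sigma the L1-optimal mu is the median of the residuals, so a crude
    grid search over sigma suffices for this sketch.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    t = (np.arange(n) + 0.5) / n                 # midpoint quantile levels
    q = np.array([base_quantile(ti) for ti in t])  # base-density quantiles

    best = None
    for sigma in np.linspace(sigma_lo, sigma_hi, grid):
        r = x - sigma * q
        mu = np.median(r)                        # L1-optimal location given sigma
        loss = np.mean(np.abs(r - mu))           # empirical W1 at (mu, sigma)
        if best is None or loss < best[0]:
            best = (loss, mu, sigma)
    return best[1], best[2]


# Usage: recover (mu, sigma) = (2.0, 1.5) from synthetic Gaussian data.
rng = np.random.default_rng(0)
samples = 2.0 + 1.5 * rng.standard_normal(2000)
mu_hat, sigma_hat = w1_fit(samples, NormalDist().inv_cdf)
```

Note that in this 1-D special case the objective is in fact convex in $(\mu, \sigma)$ (an L1 regression of order statistics on base quantiles); the non-convexity the paper contends with arises in the general multivariate setting.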