Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

Date: 2026-02-26
Citations: 0
Influential: 0
AI Summary
This work studies when the mean of a high-dimensional Gaussian distribution is identifiable, and how to estimate it efficiently, when each sample is observed only through the convex set that contains it, as happens with rounding or limited sensor resolution. Combining tools from convex geometry, statistical identifiability theory, and optimization, the paper establishes the first complete characterization of the necessary and sufficient conditions under which the mean is identifiable. Building on this characterization, it proposes the first estimator that is both computationally efficient (polynomial time) and sample-efficient, resolving two open questions in this observational setting.
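To make the notion of non-identifiability concrete, the following is a minimal sketch of one illustrative case, not an example taken from the paper. It assumes a 2-dimensional Gaussian with identity covariance and a hypothetical partition of the plane into vertical slabs of width 1. Each slab is convex, yet the probability of every slab depends only on the first coordinate of the mean, so two means that differ only in the second coordinate produce exactly the same distribution over coarse observations. The names `slab_probability`, `mean_a`, and `mean_b` are assumptions introduced for this illustration.

```python
import math


def std_normal_cdf(t):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def slab_probability(mean, k):
    """P(x in slab k) for x ~ N(mean, I_2).

    Slab k is the convex set {x : k - 0.5 <= x[0] < k + 0.5}.
    Only the first coordinate of the mean enters the formula.
    """
    m1 = mean[0]
    return std_normal_cdf(k + 0.5 - m1) - std_normal_cdf(k - 0.5 - m1)


# Two hypothetical means that differ only in the second coordinate.
mean_a = (0.3, -2.0)
mean_b = (0.3, 5.0)

slabs = range(-8, 9)
probs_a = [slab_probability(mean_a, k) for k in slabs]
probs_b = [slab_probability(mean_b, k) for k in slabs]

# The coarse distributions coincide, so no number of coarse samples
# can distinguish mean_a from mean_b under this partition.
print(probs_a == probs_b)
```

Under this slab partition the second coordinate of the mean can never be recovered, which is the kind of "low information" situation that an identifiability condition has to rule out.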

Abstract
Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sensor limitations, and lag in economic systems. We study Gaussian mean estimation from coarse data, where each true sample $x$ is drawn from a $d$-dimensional Gaussian distribution with identity covariance, but is revealed only through the set of a partition containing $x$. When the coarse samples, roughly speaking, have "low" information, the mean cannot be uniquely recovered from observed samples (i.e., the problem is not identifiable). Recent work by Fotakis, Kalavasis, Kontonis, and Tzamos [FKKT21] established that sample-efficient mean estimation is possible when the unknown mean is identifiable and the partition consists of only convex sets. Moreover, they showed that without convexity, mean estimation becomes NP-hard. However, two fundamental questions remained open: (1) When is the mean identifiable under convex partitions? (2) Is computationally efficient estimation possible under identifiability and convex partitions? This work resolves both questions. [...]
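The coarse observation model can be illustrated with a small one-dimensional sketch. This is not the paper's algorithm; it is a plain maximum-likelihood estimate under stated assumptions: unit variance, a hypothetical partition of the real line into intervals of width 1 (rounding to the nearest integer), and a ternary search that relies on the fact that the Gaussian probability of a convex set is log-concave in the mean. All function names and the value of `true_mean` are assumptions made for this illustration.

```python
import math
import random
from collections import Counter

SQRT2 = math.sqrt(2.0)


def coarsen(x):
    """Return the cell [k - 0.5, k + 0.5) that contains x (rounding)."""
    k = math.floor(x + 0.5)
    return (k - 0.5, k + 0.5)


def interval_probability(a, b, mu):
    """P(a <= x < b) for x ~ N(mu, 1), computed with erfc for tail accuracy."""
    lo, hi = a - mu, b - mu
    if lo > 0:
        # Cell lies in the upper tail: difference of survival functions.
        return 0.5 * (math.erfc(lo / SQRT2) - math.erfc(hi / SQRT2))
    return 0.5 * (math.erfc(-hi / SQRT2) - math.erfc(-lo / SQRT2))


def coarse_log_likelihood(mu, cell_counts):
    """Log-likelihood of the observed cells; concave in mu for interval cells."""
    return sum(
        count * math.log(interval_probability(a, b, mu))
        for (a, b), count in cell_counts.items()
    )


def estimate_mean(cell_counts, iters=100):
    """Maximize the concave log-likelihood by ternary search."""
    lo = min(a for a, _ in cell_counts)
    hi = max(b for _, b in cell_counts)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if coarse_log_likelihood(m1, cell_counts) < coarse_log_likelihood(m2, cell_counts):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


random.seed(0)
true_mean = 1.7  # hypothetical ground truth, used only to generate data
samples = [random.gauss(true_mean, 1.0) for _ in range(5000)]

# The learner sees only which cell each sample fell into.
cell_counts = Counter(coarsen(x) for x in samples)
estimate = estimate_mean(cell_counts)
print(estimate)
```

In one dimension with bounded intervals the mean is identifiable and a scalar search suffices. The open questions in the abstract concern the $d$-dimensional case with general convex partitions, where neither property is automatic.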
Problem

Research questions and friction points this paper is trying to address.

coarse data
mean estimation
identifiability
Gaussian distribution
convex partitions
Innovation

Methods, ideas, or system contributions that make the work stand out.

coarse data
mean estimation
identifiability
convex partitions
efficient algorithms