🤖 AI Summary
This paper addresses the computation of the Wasserstein barycenter under ε-differential privacy constraints. The Wasserstein barycenter, defined as the Fréchet mean of a set of probability distributions with respect to the Wasserstein metric, plays a key role in machine learning and statistical analysis; however, its input empirical distributions are often derived from sensitive data, necessitating rigorous privacy protection. The authors propose the first ε-differentially private framework for Wasserstein barycenter computation: it injects noise into the empirical distributions, guided by optimal transport geometry, and refines the result via synthetic-data optimization. Evaluated on real-world datasets, including MNIST and U.S. Census data, the method efficiently computes private barycenters while providing strict theoretical privacy guarantees. It substantially outperforms existing baselines, achieving a state-of-the-art privacy–accuracy trade-off. This work establishes a novel interdisciplinary bridge between differential privacy and optimal transport theory.
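To make the pipeline concrete, here is a minimal sketch of the two-stage idea described above, under simplifying assumptions of my own: discrete distributions on a shared 1-D grid, plain per-bin Laplace noise standing in for the paper's geometry-aware noise injection, and the closed-form 1-D barycenter (averaging quantile functions) standing in for general barycenter optimization. All function names here are hypothetical, not the paper's API.

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize_histogram(counts, epsilon, rng):
    """Laplace mechanism on bin counts. Assumes add/remove-one-record
    neighboring datasets, so per-bin L1 sensitivity is 1 and scale
    1/epsilon yields epsilon-DP. This is a simplification: the paper
    injects noise guided by optimal transport geometry instead."""
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    noisy = np.clip(noisy, 0.0, None) + 1e-12  # keep bins positive
    return noisy / noisy.sum()                  # renormalize to a distribution

def barycenter_1d(dists, support, n_quantiles=200):
    """1-D Wasserstein barycenter via the known closed form:
    average the quantile (inverse-CDF) functions of the inputs."""
    qs = np.linspace(0.01, 0.99, n_quantiles)
    inv_cdfs = [np.interp(qs, np.cumsum(p), support) for p in dists]
    return np.mean(inv_cdfs, axis=0)  # barycenter's quantile function

# Toy usage: two sensitive samples -> private histograms -> barycenter.
support = np.linspace(0.0, 1.0, 50)
counts1 = np.histogram(rng.normal(0.3, 0.05, 1000), bins=50, range=(0, 1))[0].astype(float)
counts2 = np.histogram(rng.normal(0.7, 0.05, 1000), bins=50, range=(0, 1))[0].astype(float)
priv1 = privatize_histogram(counts1, epsilon=1.0, rng=rng)
priv2 = privatize_histogram(counts2, epsilon=1.0, rng=rng)
bary = barycenter_1d([priv1, priv2], support)  # mass concentrates near 0.5
```

The sketch omits the paper's second stage (refinement via synthetic-data optimization); it only illustrates why privatizing the inputs, rather than the barycenter itself, lets the downstream barycenter computation run unchanged by post-processing.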
📝 Abstract
The Wasserstein barycenter is defined as the mean of a set of probability measures under the optimal transport metric, and has numerous applications spanning machine learning, statistics, and computer graphics. In practice these input measures are empirical distributions built from sensitive datasets, motivating a differentially private (DP) treatment. We present, to our knowledge, the first algorithms for computing Wasserstein barycenters under differential privacy. Empirically, on synthetic data, MNIST, and large-scale U.S. population datasets, our methods produce high-quality private barycenters with strong accuracy-privacy tradeoffs.