Differentially Private Distribution Release of Gaussian Mixture Models via KL-Divergence Minimization

📅 2025-06-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of releasing Gaussian mixture model (GMM) parameters under differential privacy (DP). We propose the first (ε,δ)-DP optimization framework that unifies utility evaluation via KL divergence, simultaneously protecting mixture weights, component means, and covariance matrices. Our key innovation lies in establishing an analytical relationship between privacy budget allocation and the resulting KL divergence bound, enabling principled trade-offs. To achieve this, we design a joint perturbation mechanism combining Gaussian noise for means and Wishart noise for covariances, and solve the resulting non-convex, privacy-constrained optimization problem. Experiments on multiple synthetic and real-world datasets demonstrate that, for ε ∈ [0.5, 8], our method reduces KL error by 37%–62% over state-of-the-art baselines while strictly satisfying (ε,δ)-DP guarantees—significantly advancing the privacy–utility Pareto frontier.

Technology Category

Application Category

📝 Abstract
Gaussian Mixture Models (GMMs) are widely used statistical models for representing multi-modal data distributions, with numerous applications in data mining, pattern recognition, data simulation, and machine learning. However, recent research has shown that releasing GMM parameters poses significant privacy risks, potentially exposing sensitive information about the underlying data. In this paper, we address the challenge of releasing GMM parameters while ensuring differential privacy (DP) guarantees. Specifically, we focus on the privacy protection of mixture weights, component means, and covariance matrices. We propose to use Kullback-Leibler (KL) divergence as a utility metric to assess the accuracy of the released GMM, as it captures the joint impact of noise perturbation on all the model parameters. To achieve privacy, we introduce a DP mechanism that adds carefully calibrated random perturbations to the GMM parameters. Through theoretical analysis, we quantify the effects of privacy budget allocation and perturbation statistics on the DP guarantee, and derive a tractable expression for evaluating KL divergence. We formulate and solve an optimization problem to minimize the KL divergence between the released and original models, subject to a given $(epsilon, delta)$-DP constraint. Extensive experiments on both synthetic and real-world datasets demonstrate that our approach achieves strong privacy guarantees while maintaining high utility.
Problem

Research questions and friction points this paper is trying to address.

Releasing Gaussian Mixture Models with differential privacy guarantees
Protecting mixture weights, means, and covariances from privacy risks
Minimizing KL-divergence for accurate yet private GMM parameter release
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses KL-divergence for GMM utility metric
Introduces DP mechanism with calibrated noise
Optimizes KL-divergence under DP constraints
🔎 Similar Papers
No similar papers found.
H
Hang Liu
Department of Electrical and Computer Engineering, Cornell Tech, Cornell University, New York, NY , 10044 USA
Anna Scaglione
Anna Scaglione
Professor of Electrical and Computer Engineering, Cornell University
signal processingnetworksenergy
S
S. Peisert
Computing Sciences Research, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 USA