🤖 AI Summary
To address privacy-sensitive clustering of large-scale binary and categorical data under federated learning, this paper proposes a variational inference-based federated Bayesian mixture modeling framework. Each client performs local variational inference and shares only lightweight summaries of its data, never the raw records, thereby ensuring data locality. Model structure discovery is achieved via local "merge and delete" moves within batches and "global merge" moves across batches, recovering a globally consistent clustering from batch-level results. Experiments on simulated data, benchmark datasets, and real-world large-scale electronic health records (EHR) show that the method performs well in comparison to existing clustering algorithms while scaling to large datasets and keeping data local at each node.
📝 Abstract
We present a federated learning approach for Bayesian model-based clustering of large-scale binary and categorical datasets. We introduce a principled 'divide and conquer' inference procedure using variational inference with local merge and delete moves within batches of the data in parallel, followed by 'global' merge moves across batches to find global clustering structures. We show that these merge moves require only summaries of the data in each batch, enabling federated learning across local nodes without requiring the full dataset to be shared. Empirical results on simulated and benchmark datasets demonstrate that our method performs well in comparison to existing clustering algorithms. We validate the practical utility of the method by applying it to large-scale electronic health record (EHR) data.
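To make the key idea concrete, here is a minimal sketch of a "global merge" step driven purely by batch-level summaries. This is an illustrative toy, not the paper's actual variational procedure: the summary format (per-cluster member counts and per-feature sums for binary data) and the mean-distance merge criterion are assumptions chosen for simplicity, whereas the paper derives its merge moves from the variational objective.

```python
import numpy as np

def merge_batch_summaries(summaries, tol=0.1):
    """Toy 'global merge' over per-batch cluster summaries.

    Each summary is (n_k, s_k): the member count and the per-feature sum
    of binary data assigned to one local cluster (hypothetical format).
    Clusters whose estimated Bernoulli means differ by less than `tol`
    in every feature are pooled into one global cluster. Only these
    summary statistics cross the network; raw records never leave a node.
    """
    merged = []  # list of [count, feature_sums] for global clusters
    for n_k, s_k in summaries:
        mean_k = s_k / n_k
        for g in merged:
            g_mean = g[1] / g[0]
            if np.max(np.abs(g_mean - mean_k)) < tol:
                g[0] += n_k          # pool member counts
                g[1] = g[1] + s_k    # pool feature sums
                break
        else:
            merged.append([n_k, s_k.copy()])
    return merged

# Two batches, each reporting two local clusters over 3 binary features;
# corresponding clusters across batches have nearly identical means.
batch1 = [(50, np.array([45., 5., 48.])), (40, np.array([2., 38., 3.]))]
batch2 = [(60, np.array([55., 4., 57.])), (30, np.array([1., 29., 2.]))]
clusters = merge_batch_summaries(batch1 + batch2)
print(len(clusters))  # the four local clusters collapse to two global ones
```

The point of the sketch is the communication pattern: the merge decision needs only `(n_k, s_k)` pairs, which is what makes the 'divide and conquer' procedure federated-friendly.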