Scaling Probabilistic Circuits via Data Partitioning

📅 2025-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Probabilistic circuits (PCs) face severe scalability challenges when training and inference must run on large-scale, distributed data. Method: This paper proposes the federated circuits (FC) framework, which learns PCs across multiple machines via recursive data partitioning and re-frames federated learning (FL) as a density estimation problem over distributed datasets. FC is the first framework to support horizontal, vertical, and hybrid FL within a single architecture, integrating the expressive density modeling of PCs with FL principles. Contribution/Results: Experiments on multiple large-scale datasets demonstrate that FC significantly accelerates PC training and improves scalability, and that it performs consistently on classification tasks under all three FL paradigms, validating its general applicability. The work also establishes a theoretical connection between PCs and FL at the level of density modeling, positioning PCs as principled, scalable models for privacy-preserving distributed density estimation.
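
To make the density-modeling connection concrete, the following is a minimal sketch in standard mixture/product notation (an illustration, not equations quoted from the paper). With K clients, a horizontal split (client k holds N_k of the N rows) corresponds to a sum node, i.e. a mixture weighted by shard sizes; a vertical split over disjoint feature subsets S_1, ..., S_K corresponds to a product node, under the usual product-node assumption of independence between the feature blocks; hybrid FL nests the two recursively:

    % Horizontal FL: disjoint rows per client -> sum node (mixture over clients)
    p(\mathbf{x}) = \sum_{k=1}^{K} \tfrac{N_k}{N} \, p_k(\mathbf{x})

    % Vertical FL: disjoint feature subsets per client -> product node
    p(\mathbf{x}) = \prod_{k=1}^{K} p_k(\mathbf{x}_{S_k})

    % Hybrid FL: recursive nesting of sum and product nodes
    p(\mathbf{x}) = \sum_{k=1}^{K} \tfrac{N_k}{N} \prod_{j} p_{k,j}(\mathbf{x}_{S_{k,j}})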

📝 Abstract
Probabilistic circuits (PCs) enable us to learn joint distributions over a set of random variables and to perform various probabilistic queries in a tractable fashion. Though the tractability property allows PCs to scale beyond non-tractable models such as Bayesian networks, scaling training and inference of PCs to larger, real-world datasets remains challenging. To remedy the situation, we show how PCs can be learned across multiple machines by recursively partitioning a distributed dataset, thereby unveiling a deep connection between PCs and federated learning (FL). This leads to federated circuits (FCs) -- a novel and flexible FL framework that (1) allows one to scale PCs to distributed learning environments, (2) trains PCs faster, and (3) unifies, for the first time, horizontal, vertical, and hybrid FL in one framework by re-framing FL as a density estimation problem over distributed datasets. We demonstrate FC's capability to scale PCs on various large-scale datasets, and we show FC's versatility in handling horizontal, vertical, and hybrid FL within a unified framework on multiple classification tasks.
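
As a rough illustration of how such partition-based aggregation can be computed, here is a minimal, self-contained Python sketch under simplifying assumptions: each client fits a single Gaussian to its local shard (a real FC would fit a PC), and names such as ClientDensity and horizontal_log_density are invented for this example rather than taken from the paper's code.

    # Minimal sketch: sum-node / product-node aggregation over client-local
    # density estimates. Assumption: each client fits one Gaussian to its
    # shard; a real federated circuit would fit a probabilistic circuit.
    import numpy as np
    from scipy.stats import multivariate_normal

    class ClientDensity:
        """A client's local density estimate over its own data shard."""
        def __init__(self, data):
            self.n = len(data)
            d = data.shape[1]
            self.dist = multivariate_normal(
                mean=data.mean(axis=0),
                cov=np.cov(data, rowvar=False) + 1e-6 * np.eye(d),
            )

        def logpdf(self, x):
            return self.dist.logpdf(x)

    def horizontal_log_density(clients, x):
        # Horizontal FL = sum node: a mixture over clients, weighted by
        # shard size (clients hold different rows of the same feature space).
        n_total = sum(c.n for c in clients)
        return np.logaddexp.reduce(
            [np.log(c.n / n_total) + c.logpdf(x) for c in clients]
        )

    def vertical_log_density(clients, feature_splits, x):
        # Vertical FL = product node: clients hold disjoint feature subsets
        # (same rows, different columns), so client log-densities add up.
        return sum(c.logpdf(x[cols]) for c, cols in zip(clients, feature_splits))

    # Example: two clients holding horizontal shards of a 2-feature dataset.
    rng = np.random.default_rng(0)
    clients = [ClientDensity(rng.normal(size=(500, 2))),
               ClientDensity(rng.normal(loc=3.0, size=(300, 2)))]
    print(horizontal_log_density(clients, np.zeros(2)))

Hybrid FL then follows by nesting these two aggregation steps recursively, mirroring how sum and product nodes compose inside a PC.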
Problem

Research questions and friction points this paper is trying to address.

Scaling probabilistic circuits for large datasets
Enabling faster training of probabilistic circuits
Unifying horizontal, vertical, and hybrid federated learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scaling PCs via distributed dataset partitioning
Introducing Federated Circuits for scalable learning
Unifying horizontal, vertical, hybrid FL frameworks
Jonas Seng
PhD Candidate, TU Darmstadt
Meta-Learning · AutoML · Causality · Neural Architecture Search · Hyperparameter Optimization
Florian Peter Busch
PhD Candidate, TU Darmstadt
Machine Learning · Artificial Intelligence · Causality · XAI
P. Prasad
Department of Mathematics and Computer Science, Eindhoven University of Technology
D. Dhami
Department of Mathematics and Computer Science, Eindhoven University of Technology
Martin Mundt
Professor for Lifelong Machine Learning at University of Bremen
deep learning · lifelong machine learning · continual learning
K. Kersting
Computer Science Department, TU Darmstadt; Hessian Center for AI (hessian.AI); German Research Center for AI (DFKI)