Binned semiparametric Bayesian networks

📅 2025-06-27
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the high computational cost of kernel density estimation (KDE) and the curse of dimensionality plaguing conventional binning models in high-dimensional nonparametric distribution modeling, this paper proposes a semiparametric Bayesian network. The method builds upon data binning and innovatively integrates sparse tensor representations with Fourier-based KDE to construct a bin-adaptive semiparametric framework for conditional probability estimation—reducing computational complexity while mitigating high-dimensional sparsity. Compared to existing approaches, the model significantly enhances scalability in both structure learning and probabilistic inference. Empirical results demonstrate that it matches state-of-the-art semiparametric models in log-likelihood and structural accuracy, while achieving 10×–100× speedups in training and inference. The approach thus combines theoretical rigor with practical engineering utility.

Technology Category

Application Category

📝 Abstract
This paper introduces a new type of probabilistic semiparametric model that takes advantage of data binning to reduce the computational cost of kernel density estimation in nonparametric distributions. Two new conditional probability distributions are developed for the new binned semiparametric Bayesian networks, the sparse binned kernel density estimation and the Fourier kernel density estimation. These two probability distributions address the curse of dimensionality, which typically impacts binned models, by using sparse tensors and restricting the number of parent nodes in conditional probability calculations. To evaluate the proposal, we perform a complexity analysis and conduct several comparative experiments using synthetic data and datasets from the UCI Machine Learning repository. The experiments include different binning rules, parent restrictions, grid sizes, and number of instances to get a holistic view of the model's behavior. As a result, our binned semiparametric Bayesian networks achieve structural learning and log-likelihood estimations with no statistically significant differences compared to the semiparametric Bayesian networks, but at a much higher speed. Thus, the new binned semiparametric Bayesian networks prove to be a reliable and more efficient alternative to their non-binned counterparts.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational cost in kernel density estimation
Addressing curse of dimensionality in binned models
Improving speed of Bayesian networks without sacrificing accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Binned semiparametric Bayesian networks reduce computational cost
Sparse tensors and parent restrictions address dimensionality
Faster structural learning with comparable accuracy
🔎 Similar Papers
No similar papers found.
R
Rafael Sojo
Aingura IIoT, Paseo Mikeletegui 43, 20009 Donostia-San Sebastián, Gipuzkoa, Spain; Universidad Politécnica de Madrid, Departamento de Inteligencia Artificial, 28660 Boadilla del Monte, Madrid, Spain
J
Javier Díaz-Rozo
Aingura IIoT, Paseo Mikeletegui 43, 20009 Donostia-San Sebastián, Gipuzkoa, Spain
Concha Bielza
Concha Bielza
Professor of Statistics and Operations Research, Technical University of Madrid / ELLIS Fellow
Machine LearningBayesian networksComputational NeuroscienceIndustry 4.0Influence Diagrams
Pedro Larrañaga
Pedro Larrañaga
Professor of Artificial Intelligence - Universidad Politécnica de Madrid
Bayesian NetworksFeature SelectionEstimation of Distribution AlgorithmsIndustry 4.0