Towards the Next-generation Bayesian Network Classifiers

📅 2025-08-14
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Traditional Bayesian network classifiers suffer from parameter explosion and data sparsity, limiting their ability to model high-order feature dependencies and resulting in poor probability extrapolation on real-world data. To address this, we propose NeuralKDB, the first framework to integrate distributional representation learning into the KDB (k-dependence Bayesian) paradigm. NeuralKDB employs neural networks to learn low-dimensional distributed representations of feature values and incorporates co-occurrence pattern encoding to capture semantic associations among features, thereby enabling efficient modeling of high-order dependencies. We design an end-to-end training algorithm based on stochastic gradient descent. Extensive experiments across 60 UCI benchmark datasets demonstrate that NeuralKDB significantly outperforms classical Bayesian classifiers and state-of-the-art competing models in both classification accuracy and high-order dependency modeling capability.

๐Ÿ“ Abstract
Bayesian network classifiers provide a feasible solution to tabular data classification, offering merits such as high time and memory efficiency and strong explainability. However, due to parameter explosion and data sparsity issues, Bayesian network classifiers are restricted to low-order feature dependency modeling, making them struggle to extrapolate the occurrence probabilities of complex real-world data. In this paper, we propose a novel paradigm for designing high-order Bayesian network classifiers by learning distributional representations for feature values, as has been done in word embedding and graph representation learning. The learned distributional representations encode the semantic relatedness between different features through their observed co-occurrence patterns in training data, which then serves as the basis for extrapolating the occurrence probabilities of new test samples. As a classifier design realization, we remake the K-dependence Bayesian classifier (KDB) by extending it into a neural version, i.e., NeuralKDB, where a novel neural network architecture is designed to learn distributional representations of feature values and parameterize the conditional probabilities between interdependent features. A stochastic gradient descent based algorithm is designed to train the NeuralKDB model efficiently. Extensive classification experiments on 60 UCI datasets demonstrate that the proposed NeuralKDB classifier excels in capturing high-order feature dependencies and significantly outperforms the conventional Bayesian network classifiers, as well as other competitive classifiers, including two neural network based classifiers without distributional representation learning.
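The abstract describes parameterizing the conditional probabilities between interdependent features via learned embeddings of feature values, rather than sparse count-based conditional probability tables. The following is a minimal illustrative sketch, not the paper's actual architecture: it assumes each discrete feature value has a low-dimensional embedding, and models the conditional probability of a child feature's value given its parent values (the class plus up to k parent features, as in KDB) as a softmax over embedding dot products. All names and dimensions here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a child feature with 5 possible values, each given
# an 8-dimensional embedding, and 3 observed parent values (class + two
# KDB parents), each with its own embedding.
n_values, d = 5, 8
child_emb = rng.normal(size=(n_values, d))   # one embedding per child value
parent_emb = rng.normal(size=(3, d))         # embeddings of observed parent values

def conditional_probs(child_emb, parent_emb):
    """P(child value | parents) as a softmax over embedding dot products.

    A dense, generalizing replacement for a sparse CPT entry: unseen
    parent-child combinations still get a probability via the geometry
    of the learned embeddings.
    """
    context = parent_emb.mean(axis=0)        # aggregate parent embeddings
    scores = child_emb @ context             # relatedness of each child value
    scores -= scores.max()                   # numerical stability
    p = np.exp(scores)
    return p / p.sum()

p = conditional_probs(child_emb, parent_emb)
print(p)  # a proper distribution over the child feature's values
```

Because the parent context is a function of embeddings rather than a lookup key, the number of parameters grows linearly with the number of feature values instead of exponentially with the number of parents, which is the sense in which such a parameterization sidesteps parameter explosion.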
Problem

Research questions and friction points this paper is trying to address.

Overcome low-order dependency limits in Bayesian classifiers
Address parameter explosion and data sparsity issues
Improve probability extrapolation for complex real-world data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learns distributional representations for feature values
Extends KDB into neural version NeuralKDB
Uses SGD for efficient NeuralKDB training
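The innovation list notes that NeuralKDB is trained end-to-end with stochastic gradient descent. As a hedged sketch of the idea (not the paper's training algorithm), the snippet below fits the embeddings of the softmax parameterization above by gradient descent on the negative log-likelihood of one observed child value, with a fixed parent context; the gradient is derived by hand so the example stays numpy-only. All names and settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_values, d = 4, 6
emb = rng.normal(scale=0.1, size=(n_values, d))  # child-value embeddings (learned)
context = rng.normal(size=(d,))                  # fixed parent context for this sketch
target = 2                                       # observed child value in training data
lr = 0.1

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

losses = []
for _ in range(200):
    p = softmax(emb @ context)
    losses.append(-np.log(p[target]))       # negative log-likelihood of the observation
    # d(-log p[target]) / d emb[j] = (p[j] - [j == target]) * context
    grad = np.outer(p, context)
    grad[target] -= context
    emb -= lr * grad                        # plain gradient step

print(losses[0], "->", losses[-1])          # loss should decrease
```

In the full setting one would sum such per-sample losses over all features and training samples and draw minibatches, which is what makes plain SGD applicable and efficient here.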
Huan Zhang
School of Computer and Artificial Intelligence, Zhengzhou University
Daokun Zhang
University of Nottingham Ningbo China
Graph Learning · Data Mining · Machine Learning
Kexin Meng
School of Computer and Artificial Intelligence, Zhengzhou University
Geoffrey I. Webb
Department of Data Science and AI, Monash University