AI Summary
Traditional Bayesian network classifiers suffer from parameter explosion and data sparsity, which limit their ability to model high-order feature dependencies and lead to poor probability extrapolation on real-world data. To address this, we propose NeuralKDB, the first framework to integrate distributional representation learning into the KDB (k-dependence Bayesian classifier) paradigm. NeuralKDB employs neural networks to learn low-dimensional distributed representations of feature values and encodes feature co-occurrence patterns to capture semantic associations among features, thereby enabling efficient modeling of high-order dependencies. We design an end-to-end training algorithm based on stochastic gradient descent. Extensive experiments on 60 UCI benchmark datasets demonstrate that NeuralKDB significantly outperforms classical Bayesian classifiers and state-of-the-art competing models in both classification accuracy and high-order dependency modeling capability.
Abstract
Bayesian network classifiers provide a feasible solution to tabular data classification, offering merits such as high time and memory efficiency and strong explainability. However, due to parameter explosion and data sparsity, Bayesian network classifiers are restricted to low-order feature dependency modeling, and thus struggle to extrapolate the occurrence probabilities of complex real-world data. In this paper, we propose a novel paradigm for designing high-order Bayesian network classifiers by learning distributional representations of feature values, as has been done in word embedding and graph representation learning. The learned distributional representations encode the semantic relatedness between different features through their observed co-occurrence patterns in the training data, and then serve as a hallmark for extrapolating the occurrence probabilities of new test samples. As a concrete realization, we remake the k-dependence Bayesian classifier (KDB) into a neural version, NeuralKDB, in which a novel neural network architecture learns distributional representations of feature values and parameterizes the conditional probabilities between interdependent features. A stochastic gradient descent based algorithm is designed to train the NeuralKDB model efficiently. Extensive classification experiments on 60 UCI datasets demonstrate that the proposed NeuralKDB classifier excels at capturing high-order feature dependencies and significantly outperforms conventional Bayesian network classifiers, as well as other competitive classifiers, including two neural network based classifiers without distributional representation learning.
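The core idea described above can be illustrated with a minimal sketch: represent each discrete feature value by a low-dimensional embedding, and parameterize a conditional probability P(x_i | parents) as a softmax over dot products between candidate-value embeddings and a context vector built from the parent values' embeddings, trained by SGD on the negative log-likelihood. This is an assumed toy construction for illustration only (all class and variable names here are hypothetical), not the paper's exact NeuralKDB architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()           # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

class NeuralCPT:
    """Toy neural conditional probability table for one feature (hypothetical sketch)."""

    def __init__(self, n_values, n_parent_values, dim=8, lr=0.1):
        # Low-dimensional embeddings for the feature's own values (V)
        # and for the values its parents can take (U).
        self.V = rng.normal(scale=0.1, size=(n_values, dim))
        self.U = rng.normal(scale=0.1, size=(n_parent_values, dim))
        self.lr = lr

    def prob(self, parent_ids):
        # Context vector = mean of parent-value embeddings;
        # softmax over V @ context gives P(x_i = v | parents).
        c = self.U[parent_ids].mean(axis=0)
        return softmax(self.V @ c), c

    def sgd_step(self, value_id, parent_ids):
        # One SGD step on the negative log-likelihood -log P(value | parents).
        p, c = self.prob(parent_ids)
        grad_logits = p.copy()
        grad_logits[value_id] -= 1.0              # softmax cross-entropy gradient
        self.V -= self.lr * np.outer(grad_logits, c)
        grad_c = self.V.T @ grad_logits
        self.U[parent_ids] -= self.lr * grad_c / len(parent_ids)
        return -np.log(p[value_id])

# Toy usage: a feature with 3 values, conditioned on parent values drawn from 4 ids.
cpt = NeuralCPT(n_values=3, n_parent_values=4)
for _ in range(200):
    cpt.sgd_step(value_id=2, parent_ids=[0, 3])
p, _ = cpt.prob([0, 3])
print(p.argmax())  # the repeatedly observed value 2 should dominate
```

The point of the sketch is the mechanism highlighted in the abstract: because the probability is computed from shared embeddings rather than counted from a sparse contingency table, parent-value combinations never seen in training still yield smooth, nonzero probability estimates.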