AI Summary
Traditional Bayesian network classifiers suffer from parameter explosion and data sparsity, which limit their ability to model high-order feature dependencies and lead to poor probability extrapolation on real-world data. To address this, we propose NeuralKDB, the first framework to integrate distributional representation learning into the KDB (k-dependence Bayesian classifier) paradigm. NeuralKDB employs neural networks to learn low-dimensional distributed representations of feature values and encodes feature co-occurrence patterns to capture semantic associations among features, thereby enabling efficient modeling of high-order dependencies. We design an end-to-end training algorithm based on stochastic gradient descent. Extensive experiments on 60 UCI benchmark datasets demonstrate that NeuralKDB significantly outperforms classical Bayesian classifiers and state-of-the-art competing models in both classification accuracy and high-order dependency modeling capability.
Abstract
Bayesian network classifiers provide a feasible solution to tabular data classification, offering merits such as high time and memory efficiency and strong explainability. However, due to parameter explosion and data sparsity, Bayesian network classifiers are restricted to low-order feature dependency modeling, and thus struggle to extrapolate the occurrence probabilities of complex real-world data. In this paper, we propose a novel paradigm for designing high-order Bayesian network classifiers by learning distributional representations of feature values, as has been done in word embedding and graph representation learning. The learned distributional representations encode the semantic relatedness between different features through their observed co-occurrence patterns in the training data, and then serve as a hallmark for extrapolating the occurrence probabilities of new test samples. As a concrete realization, we remake the k-dependence Bayesian classifier (KDB) into a neural version, NeuralKDB, in which a novel neural network architecture learns distributional representations of feature values and parameterizes the conditional probabilities between interdependent features. A stochastic gradient descent based algorithm is designed to train the NeuralKDB model efficiently. Extensive classification experiments on 60 UCI datasets demonstrate that the proposed NeuralKDB classifier excels at capturing high-order feature dependencies and significantly outperforms conventional Bayesian network classifiers, as well as other competitive classifiers, including two neural network based classifiers without distributional representation learning.
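The core idea described above can be illustrated with a minimal sketch: represent each discrete feature value by a low-dimensional embedding, and parameterize a conditional probability P(x_i | parents) as a softmax over dot products between candidate-value embeddings and a context vector built from the parent values' embeddings, trained by SGD on the negative log-likelihood. This is an assumed toy construction for illustration only (all class and variable names here are hypothetical), not the paper's exact NeuralKDB architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()           # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

class NeuralCPT:
    """Toy neural conditional probability table for one feature (hypothetical sketch)."""

    def __init__(self, n_values, n_parent_values, dim=8, lr=0.1):
        # Low-dimensional embeddings for the feature's own values (V)
        # and for the values its parents can take (U).
        self.V = rng.normal(scale=0.1, size=(n_values, dim))
        self.U = rng.normal(scale=0.1, size=(n_parent_values, dim))
        self.lr = lr

    def prob(self, parent_ids):
        # Context vector = mean of parent-value embeddings;
        # softmax over V @ context gives P(x_i = v | parents).
        c = self.U[parent_ids].mean(axis=0)
        return softmax(self.V @ c), c

    def sgd_step(self, value_id, parent_ids):
        # One SGD step on the negative log-likelihood -log P(value | parents).
        p, c = self.prob(parent_ids)
        grad_logits = p.copy()
        grad_logits[value_id] -= 1.0              # softmax cross-entropy gradient
        self.V -= self.lr * np.outer(grad_logits, c)
        grad_c = self.V.T @ grad_logits
        self.U[parent_ids] -= self.lr * grad_c / len(parent_ids)
        return -np.log(p[value_id])

# Toy usage: a feature with 3 values, conditioned on parent values drawn from 4 ids.
cpt = NeuralCPT(n_values=3, n_parent_values=4)
for _ in range(200):
    cpt.sgd_step(value_id=2, parent_ids=[0, 3])
p, _ = cpt.prob([0, 3])
print(p.argmax())  # the repeatedly observed value 2 should dominate
```

The point of the sketch is the mechanism highlighted in the abstract: because the probability is computed from shared embeddings rather than counted from a sparse contingency table, parent-value combinations never seen in training still yield smooth, nonzero probability estimates.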