🤖 AI Summary
This work aims to minimize the Bayes error rate for classification tasks. To this end, we propose the BOLT loss: a differentiable upper bound on the Bayes error rate, constructed via $f$-divergence and made end-to-end optimizable through sampling from the model's outputs. Unlike conventional approaches that only indirectly approximate the optimal decision rule, BOLT embeds Bayes optimality directly into the training objective, with theoretical guarantees that its minimization drives a model toward the Bayes error rate. Experiments on MNIST, Fashion-MNIST, CIFAR-10, and IMDb demonstrate that BOLT matches or surpasses cross-entropy performance. Notably, it exhibits significantly improved generalization under challenging conditions such as distribution shift and high label noise, validating both its theoretical advantages and practical utility.
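The summary does not state which $f$-divergence the paper uses. As a hedged illustration of the general mechanism, one classical $f$-divergence route to a Bayes-error upper bound goes through the squared Hellinger distance $H^2$ (the $f$-divergence with generator $f(t) = (\sqrt{t} - 1)^2$). For a binary problem with priors $\pi_0, \pi_1$ and class-conditional densities $p_0, p_1$,

$$
E^{*} \;=\; \int \min\!\big(\pi_0\, p_0(x),\; \pi_1\, p_1(x)\big)\,dx
\;\le\; \sqrt{\pi_0 \pi_1} \int \sqrt{p_0(x)\, p_1(x)}\;dx
\;=\; \sqrt{\pi_0 \pi_1}\left(1 - \tfrac{1}{2} H^{2}(p_0, p_1)\right),
$$

where the inequality is $\min(a, b) \le \sqrt{ab}$ (the Bhattacharyya bound). BOLT's bound may be built from a different $f$-divergence; the point is only that $f$-divergences between class conditionals control the Bayes error from above.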
📝 Abstract
This work invokes the notion of $f$-divergence to introduce a novel upper bound on the Bayes error rate of a general classification task. We show that the proposed bound can be computed by sampling from the output of a parameterized model. Using this practical interpretation, we introduce the Bayes optimal learning threshold (BOLT) loss, whose minimization enforces a classification model to achieve the Bayes error rate. We validate the proposed loss on image and text classification tasks, using the MNIST, Fashion-MNIST, CIFAR-10, and IMDb datasets. Numerical experiments demonstrate that models trained with BOLT achieve performance on par with or exceeding that of cross-entropy, particularly on challenging datasets. This highlights the potential of BOLT for improving generalization.
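The abstract does not spell out the loss itself; below is a minimal PyTorch sketch of the recipe it describes, i.e., an error-rate-style objective estimated by sampling from the model's softmax output while remaining end-to-end differentiable. The function name `bolt_style_loss`, the straight-through Gumbel-softmax sampler, and the geometric-mean bound term are all illustrative assumptions, not the paper's construction.

```python
import torch
import torch.nn.functional as F

def bolt_style_loss(logits: torch.Tensor, targets: torch.Tensor,
                    tau: float = 1.0, n_samples: int = 4) -> torch.Tensor:
    """Hedged sketch of a BOLT-style objective; NOT the paper's exact loss.

    logits:  (batch, num_classes) raw model outputs.
    targets: (batch,) integer class labels.
    """
    num_classes = logits.size(-1)
    onehot = F.one_hot(targets, num_classes).float()

    # Monte Carlo estimate of the model's expected 0-1 error: draw hard
    # predictions from the softmax distribution with straight-through
    # Gumbel-softmax, so the sampling step stays differentiable.
    err = logits.new_zeros(logits.size(0))
    for _ in range(n_samples):
        y_hat = F.gumbel_softmax(logits, tau=tau, hard=True)  # (batch, C) one-hot
        err = err + 1.0 - (y_hat * onehot).sum(dim=-1)        # 1 where sample != label
    err = err / n_samples

    # Smooth bound term: min(q, 1 - q) <= sqrt(q * (1 - q)) upper-bounds the
    # error of the optimal decision under the induced correct-vs-rest
    # posterior (q, 1 - q); used here only as an illustrative surrogate.
    q_true = (F.softmax(logits, dim=-1) * onehot).sum(dim=-1)
    bound = torch.sqrt((q_true * (1.0 - q_true)).clamp_min(1e-12))

    return (err + bound).mean()

# Example usage with a hypothetical 10-class batch:
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
loss = bolt_style_loss(logits, targets)
loss.backward()  # gradients flow through both the sampled term and the bound
```

The straight-through estimator is one standard way to keep a sampling step inside a differentiable objective; the paper may well use a different mechanism to make its bound trainable.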