🤖 AI Summary
BatchBALD conflates epistemic and aleatoric uncertainty, limiting its scalability and performance in large-scale Bayesian active learning. **Method:** We propose PureBALD, the first acquisition function that models epistemic uncertainty *exclusively* via predictive probabilities, fully decoupling it from aleatoric uncertainty. It introduces an efficient, scalable batch acquisition objective that avoids the combinatorial bottleneck of jointly evaluating candidate batches inherent in BatchBALD. Built on Bayesian neural networks, Monte Carlo sampling, and mutual-information approximation, PureBALD accounts for both predictive entropy and joint epistemic uncertainty across the batch. **Contribution/Results:** Across multiple image and text benchmarks, PureBALD consistently outperforms BatchBALD, achieving 3–5× higher query efficiency, supporting batch sizes up to 1,000, and delivering faster convergence and better generalization.
📝 Abstract
We observe that BatchBALD, a popular acquisition function for batch Bayesian active learning for classification, can conflate epistemic and aleatoric uncertainty, leading to suboptimal performance. Motivated by this observation, we propose to focus on the predictive probabilities, which only exhibit epistemic uncertainty. The result is an acquisition function that not only performs better, but is also faster to evaluate, allowing for larger batches than before.
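For readers unfamiliar with the baseline being improved upon, the (non-batch) BALD score estimates the mutual information between a point's label and the model parameters from Monte Carlo posterior samples; the total predictive entropy mixes epistemic and aleatoric uncertainty, and subtracting the expected per-sample entropy isolates the epistemic part. The sketch below is illustrative background, not the paper's implementation; the function names, array shapes, and the acquisition-size of 10 in the usage comment are our own assumptions:

```python
import numpy as np

def entropy(p, axis=-1):
    """Shannon entropy of categorical distributions along `axis`."""
    return -np.sum(p * np.log(np.clip(p, 1e-12, 1.0)), axis=axis)

def bald_scores(probs):
    """BALD mutual-information scores (illustrative sketch, not the paper's method).

    probs: array of shape (S, N, C) — S Monte Carlo posterior samples
    (e.g. MC-dropout forward passes), N pool points, C classes.
    Returns one score per pool point.
    """
    mean_p = probs.mean(axis=0)               # marginal predictive distribution
    total = entropy(mean_p)                   # H[y|x]: epistemic + aleatoric
    expected = entropy(probs).mean(axis=0)    # E_theta H[y|x, theta]: aleatoric
    return total - expected                   # mutual information: epistemic

# Usage: score the pool, then acquire the highest-scoring points, e.g.
#   query_idx = np.argsort(bald_scores(probs))[-10:]
```

BatchBALD extends this to the joint mutual information of a whole batch, which is what makes its evaluation combinatorially expensive; the abstract's proposal avoids that joint computation by working with the predictive probabilities directly.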