🤖 AI Summary
Softmax outputs lack hard probabilistic constraints, leading to unreliable model calibration. To address this, we propose Box-Constrained Softmax (BCSoftmax), the first method to impose explicit box constraints—i.e., analytically defined upper and lower bounds—directly on the softmax output space. BCSoftmax yields a differentiable, closed-form, and theoretically grounded probabilistic constraint mechanism. Building upon it, we design two novel post-hoc calibration methods: (1) analytical calibration driven by constrained optimization, and (2) boundary-aware logit rescaling. Extensive experiments on TinyImageNet, CIFAR-100, and 20NewsGroups demonstrate that BCSoftmax significantly reduces Expected Calibration Error (ECE) by an average of 38.2% and improves Brier Score, effectively mitigating both overconfidence and underconfidence. The approach enhances model trustworthiness and deployment robustness without compromising accuracy or inference efficiency.
📝 Abstract
Controlling the output probabilities of softmax-based models is a common problem in modern machine learning. Although the $\mathrm{Softmax}$ function provides soft control via its temperature parameter, it lacks the ability to enforce hard constraints, such as box constraints, on output probabilities, which can be critical in applications requiring reliable and trustworthy models. In this work, we propose the box-constrained softmax ($\mathrm{BCSoftmax}$) function, a novel generalization of the $\mathrm{Softmax}$ function that explicitly enforces lower and upper bounds on output probabilities. Although $\mathrm{BCSoftmax}$ is formulated as the solution to a box-constrained optimization problem, we develop an exact and efficient algorithm for computing it. As a key application, we introduce two post-hoc calibration methods based on $\mathrm{BCSoftmax}$. The proposed methods mitigate underconfidence and overconfidence in predictive models by learning the lower and upper bounds of the output probabilities or logits after model training, thereby enhancing reliability in downstream decision-making tasks. We demonstrate the effectiveness of our methods experimentally on the TinyImageNet, CIFAR-100, and 20NewsGroups datasets, achieving improvements in calibration metrics.
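To make the idea concrete, here is a minimal sketch of what a box-constrained softmax could look like. It is *not* the paper's algorithm (which is not reproduced on this page); it assumes BCSoftmax is the maximizer of $\langle p, z\rangle + H(p)$ subject to $l \le p \le u$ and $\mathbf{1}^\top p = 1$, whose KKT conditions give $p_i = \mathrm{clip}(e^{z_i}/t,\, l_i,\, u_i)$ for a scalar $t > 0$ found by bisection so the probabilities sum to one. The function name `bc_softmax` and the strict-feasibility assumption ($\sum_i l_i < 1 < \sum_i u_i$) are ours:

```python
import numpy as np

def bc_softmax(z, lower, upper, tol=1e-12, max_iter=200):
    """Sketch of a box-constrained softmax (NOT the paper's exact algorithm).

    Solves  max_p <p, z> + H(p)  s.t.  lower <= p <= upper,  sum(p) = 1,
    using the KKT form p_i = clip(exp(z_i) / t, lower_i, upper_i) and
    bisection on the scalar t. Assumes sum(lower) < 1 < sum(upper).
    """
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # shift logits for numerical stability

    def total(t):
        # Non-increasing in t, so bisection on total(t) = 1 is valid.
        return np.clip(e / t, lower, upper).sum()

    # Bracket the root: shrink/grow t until total(t_lo) >= 1 >= total(t_hi).
    t_lo = t_hi = 1.0
    while total(t_lo) < 1.0:
        t_lo *= 0.5
    while total(t_hi) > 1.0:
        t_hi *= 2.0

    for _ in range(max_iter):
        t = 0.5 * (t_lo + t_hi)
        if total(t) > 1.0:
            t_lo = t
        else:
            t_hi = t
        if t_hi - t_lo < tol * t_hi:
            break
    return np.clip(e / t_hi, lower, upper)
```

With trivial bounds (`lower = 0`, `upper = 1`) no entry is clipped and the result reduces to the ordinary softmax; tightening an upper bound caps that class's probability and redistributes the excess mass over the unclipped classes in softmax proportion.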