Diffusion-Inspired Reconfiguration of Transformers for Uncertainty Calibration

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Pretrained Transformers lack an effective mechanism for uncertainty propagation, limiting their reliability in risk-sensitive scenarios. This work proposes the first integration of diffusion process principles into the Transformer architecture by modeling each feature transformation module as a probabilistic mapping, thereby establishing an end-to-end pathway for propagating uncertainty from the input distribution to the pretrained feature distribution. The resulting model significantly enhances calibration performance without compromising predictive accuracy. Empirical evaluations across multiple vision-and-language benchmarks demonstrate that this approach outperforms existing uncertainty-aware Transformers, offering a more reliable framework for applications where calibrated uncertainty estimates are critical.

📝 Abstract
Uncertainty calibration in pre-trained transformers is critical for their reliable deployment in risk-sensitive applications. Yet, most existing pre-trained transformers do not have a principled mechanism for uncertainty propagation through their feature transformation stack. In this work, we propose a diffusion-inspired reconfiguration of transformers in which each feature transformation block is modeled as a probabilistic mapping. Composing these probabilistic mappings reveals a probability path that mimics the structure of a diffusion process, transporting data mass from the input distribution to the pre-trained feature distribution. This probability path can then be recompiled on a diffusion process with a unified transition model to enable principled propagation of representation uncertainty throughout the pre-trained model's architecture while maintaining its original predictive performance. Empirical results across a variety of vision and language benchmarks demonstrate that our method achieves superior calibration and predictive accuracy compared to existing uncertainty-aware transformers.
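The abstract's core idea — modeling each feature-transformation block as a probabilistic mapping whose composition forms a diffusion-like probability path from the input distribution to the feature distribution — can be illustrated with a toy sketch. Everything below is an assumption for illustration: the scalar Gaussian state, the linear-Gaussian block parameters, and the `propagate` helper are not the paper's actual transition model.

```python
# Toy sketch (not the paper's method): push a Gaussian N(mean, var) through a
# stack of blocks, each modeled as a linear-Gaussian transition
#   x_{t+1} = a_t * x_t + eps_t,  eps_t ~ N(0, s2_t),
# so composing blocks yields a diffusion-style probability path.

def propagate(mean, var, blocks):
    """Each block is (scale a, noise variance s2); returns the full path."""
    path = [(mean, var)]
    for a, s2 in blocks:
        mean = a * mean            # mean is rescaled by the block
        var = a * a * var + s2     # variance is rescaled and inflated by noise
        path.append((mean, var))
    return path

# Input distribution N(1.0, 0.5) transported through three probabilistic blocks.
blocks = [(0.9, 0.01), (0.8, 0.02), (1.1, 0.01)]
path = propagate(1.0, 0.5, blocks)
final_mean, final_var = path[-1]
```

The point of the sketch is that uncertainty (the variance term) is carried end-to-end through the block stack rather than being estimated only at the output, which is the propagation property the paper argues standard pre-trained transformers lack.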
Problem

Research questions and friction points this paper is trying to address.

uncertainty calibration
pre-trained transformers
uncertainty propagation
feature transformation
reliable deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion process
uncertainty calibration
probabilistic mapping
transformer reconfiguration
representation uncertainty
🔎 Similar Papers
2023-03-04 · International Conference on Learning Representations · Citations: 13
Manh Cuong Dao
National University of Singapore
Machine Learning
Quang Hung Pham
School of Information and Communications Technology, Hanoi University of Science and Technology, Hanoi, Vietnam
Phi Le Nguyen
School of Information and Communications Technology, Hanoi University of Science and Technology, Hanoi, Vietnam
Thao Nguyen Truong
National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
Bryan Kian Hsiang Low
Associate Professor (with tenure), Department of Computer Science, National University of Singapore
Bayesian Optimization · Gaussian Processes · Federated Learning · Data-centric AI · Data Valuation
Trong Nghia Hoang
Assistant Professor, Washington State University
Machine Learning · Federated Learning · Meta Learning · Model Fusion · Gaussian Processes