🤖 AI Summary
Deep learning models often lack reliable uncertainty quantification (UQ), and existing UQ methods are fragmented and evaluated under inconsistent protocols, which hinders trustworthy deployment in safety-critical applications. To address this, the authors introduce Torch-Uncertainty, a modular, open-source UQ framework built on PyTorch and Lightning that supports classification, segmentation, and regression tasks in a single library. It integrates established UQ techniques, including Bayesian neural networks, deep ensembles, Monte Carlo Dropout, and temperature scaling, alongside standardized evaluation metrics (e.g., Expected Calibration Error, Brier score, and AUROC for uncertainty). The framework adopts a decoupled architecture, so that UQ methods can be plugged in independently of the model and evaluated through automated pipelines. Experiments benchmark a diverse set of UQ methods across these tasks; the framework is intended to lower the barrier to UQ adoption, make evaluation more efficient, and improve reproducibility.
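To make two of the metrics named above concrete, here is a minimal pure-Python sketch of Expected Calibration Error (with equal-width confidence bins) and the multiclass Brier score. It is an illustration only and does not reproduce Torch-Uncertainty's API: the function names, the `n_bins` parameter, and the binning convention are assumptions chosen for this example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """ECE with equal-width bins: weighted mean of |accuracy - confidence|.

    Hypothetical helper for illustration. Bins are half-open [lo, hi),
    with a confidence of exactly 1.0 placed in the last bin.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per bin: [sample count, sum of confidences, number of correct predictions]
    bin_stats = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        b = min(int(conf * n_bins), n_bins - 1)
        bin_stats[b][0] += 1
        bin_stats[b][1] += conf
        bin_stats[b][2] += int(ok)

    ece = 0.0
    for count, conf_sum, n_correct in bin_stats:
        if count == 0:
            continue  # empty bins contribute nothing
        gap = abs(n_correct / count - conf_sum / count)
        ece += (count / n) * gap
    return ece


def brier_score(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> float:
    """Multiclass Brier score: mean squared error against one-hot labels."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equal length")
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum(
            (p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p)
        )
    return total / len(probs)
```

Both scores are lower-is-better. ECE only looks at the confidence of the predicted class, whereas the Brier score penalizes the full predicted distribution, so the two can rank methods differently.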
📝 Abstract
Deep Neural Networks (DNNs) have demonstrated remarkable performance across various domains, including computer vision and natural language processing. However, they often struggle to accurately quantify the uncertainty of their predictions, limiting their broader adoption in critical real-world applications. Uncertainty Quantification (UQ) for Deep Learning seeks to address this challenge by providing methods to improve the reliability of uncertainty estimates. Although numerous techniques have been proposed, a unified tool offering a seamless workflow to evaluate and integrate these methods remains lacking. To bridge this gap, we introduce Torch-Uncertainty, a PyTorch and Lightning-based framework designed to streamline DNN training and evaluation with UQ techniques and metrics. In this paper, we outline the foundational principles of our library and present comprehensive experimental results that benchmark a diverse set of UQ methods across classification, segmentation, and regression tasks. Our library is available at https://github.com/ENSTA-U2IS-AI/Torch-Uncertainty