🤖 AI Summary
This work addresses the secure deployment of deep learning models in open-world settings, focusing on reliable uncertainty estimation for out-of-distribution (OOD) samples at test time. We propose the first theoretical framework grounded in linearized training dynamics, deriving a differentiable, posterior-style upper bound on predictive uncertainty under weight perturbations, without requiring retraining. Our method integrates training-dynamics modeling, stochastic weight-perturbation sampling, and prediction ensembling, accompanied by rigorous error-bound analysis. Evaluated on large-scale image-based OOD benchmarks, it achieves state-of-the-art performance, particularly improving detection accuracy for near-OOD samples. The approach bridges theoretical interpretability with practical efficiency, offering both provable guarantees and computational scalability for real-world deployment.
📝 Abstract
A reliable uncertainty estimation method is the foundation of many modern out-of-distribution (OOD) detectors, which are critical for the safe deployment of deep learning models in the open world. In this work, we propose TULiP, a theoretically driven post-hoc uncertainty estimator for OOD detection. Our approach considers a hypothetical perturbation applied to the network before convergence. Based on linearized training dynamics, we bound the effect of such a perturbation, resulting in an uncertainty score computable by perturbing model parameters. Ultimately, our approach computes uncertainty from a set of sampled predictions. We visualize our bound on synthetic regression and classification datasets. Furthermore, we demonstrate the effectiveness of TULiP using large-scale OOD detection benchmarks for image classification. Our method exhibits state-of-the-art performance, particularly for near-distribution samples.
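The perturb-and-ensemble recipe described above (sample weight perturbations, collect predictions, score uncertainty) can be illustrated with a minimal toy sketch. This is not TULiP itself — the linear model, Gaussian perturbation scale `sigma`, and entropy-based score below are illustrative assumptions, not the paper's derived bound:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" linear classifier: logits = x @ W (weights are illustrative).
W = rng.normal(size=(4, 3))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def perturb_ensemble_uncertainty(x, W, n_samples=32, sigma=0.1):
    """Sample stochastic weight perturbations, ensemble the resulting
    predictions, and score uncertainty as the entropy of the mean
    predictive distribution (an assumed stand-in score, not TULiP's bound)."""
    probs = []
    for _ in range(n_samples):
        W_pert = W + sigma * rng.normal(size=W.shape)  # perturb parameters
        probs.append(softmax(x @ W_pert))
    mean_prob = np.mean(probs, axis=0)                 # prediction ensembling
    return -np.sum(mean_prob * np.log(mean_prob + 1e-12), axis=-1)

x = rng.normal(size=(5, 4))       # 5 test inputs, 4 features
scores = perturb_ensemble_uncertainty(x, W)
print(scores.shape)
```

Inputs with higher scores would be flagged as more OOD-like; in the actual method, the perturbation is tied to the linearized training dynamics rather than an arbitrary Gaussian.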