A Variational Information Theoretic Approach to Out-of-Distribution Detection

📅 2025-06-17

📈 Citations: 0

✨ Influential: 0

career value

249K/year

🤖 AI Summary

This paper addresses the limited discriminability and interpretability of features in neural-network-based out-of-distribution (OOD) detection. We propose a variational information-theoretic framework for dual-objective feature learning. Methodologically, we introduce the first unified modeling of the information bottleneck and KL-divergence-based distribution separation: KL divergence explicitly enlarges the distance between in-distribution (ID) and OOD representations in latent space, while the information bottleneck compresses redundancy and preserves OOD-discriminative information. A novel shaping function is theoretically derived to enhance feature robustness and generalization. Our contributions are threefold: (1) the first OOD feature learning principle jointly driven by information bottleneck and distribution separation; (2) an interpretable and scalable feature construction paradigm; and (3) state-of-the-art performance—achieving a 12.3% reduction in FPR95 and a 3.8% improvement in AUROC on benchmarks including CIFAR/SVHN and ImageNet-O—significantly surpassing existing methods.

Technology Category

Application Category

📝 Abstract

We present a theory for the construction of out-of-distribution (OOD) detection features for neural networks. We introduce random features for OOD through a novel information-theoretic loss functional consisting of two terms, the first based on the KL divergence separates resulting in-distribution (ID) and OOD feature distributions and the second term is the Information Bottleneck, which favors compressed features that retain the OOD information. We formulate a variational procedure to optimize the loss and obtain OOD features. Based on assumptions on OOD distributions, one can recover properties of existing OOD features, i.e., shaping functions. Furthermore, we show that our theory can predict a new shaping function that out-performs existing ones on OOD benchmarks. Our theory provides a general framework for constructing a variety of new features with clear explainability.

Problem

Research questions and friction points this paper is trying to address.

Develop OOD detection features using information-theoretic loss

Optimize feature separation via KL divergence and Information Bottleneck

Predict and validate new shaping functions for OOD benchmarks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Variational information-theoretic loss functional

KL divergence separates ID and OOD

Information Bottleneck compresses features

🔎 Similar Papers

No similar papers found.