A Systematic Analysis of Out-of-Distribution Detection Under Representation and Training Paradigm Shifts

📅 2025-11-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work systematically investigates how representation paradigms (CNNs vs. Vision Transformers) and training paradigms (training from scratch vs. fine-tuning) affect out-of-distribution (OOD) detection performance. Through extensive multi-dataset experiments, we identify feature-space geometry as the key determinant of detection efficacy: geometry-aware scoring functions exhibit superior robustness to large distribution shifts for CNNs, whereas gradient-based methods and kernel PCA perform competitively on ViTs. Methodologically, we introduce a novel joint AURC/AUGRC evaluation framework, augmented with Friedman’s test, Conover–Holm correction, and Bron–Kerbosch clique analysis, and propose PCA-projection-optimized multi-class OOD detectors. Empirically, probabilistic scores excel in misclassification detection; Monte Carlo Dropout (MCD) performance degrades with increasing class count; and lightweight PCA consistently enhances diverse detectors—providing strong evidence for a “representation-centric” paradigm in OOD detection.
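The AURC metric named in the summary measures selective-prediction risk averaged over all coverage levels. A minimal sketch of its discrete computation, assuming per-sample confidence scores and 0/1 correctness flags (an illustration, not the paper's exact implementation):

```python
def aurc(confidences, correct):
    """Area under the risk-coverage curve (lower is better).

    confidences: per-sample confidence scores (higher = more confident)
    correct:     per-sample 0/1 flags (1 = the prediction was correct)
    """
    n = len(confidences)
    # Retain samples in order of decreasing confidence.
    order = sorted(range(n), key=lambda i: -confidences[i])
    errors = 0
    risks = []
    for cov, i in enumerate(order, start=1):
        errors += 1 - correct[i]
        risks.append(errors / cov)  # selective risk at coverage cov/n
    return sum(risks) / n  # average risk across coverage levels

# Example: one of four predictions is wrong; retaining it raises
# the selective risk at coverages 3/4 and 4/4.
print(aurc([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 1]))  # ≈ 0.1458 (= 7/48)
```

AUGRC follows the same risk-coverage construction with a generalized-risk weighting; the joint use of both is what the paper's evaluation framework contributes.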

📝 Abstract
We present a systematic comparison of out-of-distribution (OOD) detection methods across CLIP-stratified regimes using AURC and AUGRC as primary metrics. Experiments cover two representation paradigms: CNNs trained from scratch and a fine-tuned Vision Transformer (ViT), evaluated on CIFAR-10/100, SuperCIFAR-100, and TinyImageNet. Using a multiple-comparison-controlled, rank-based pipeline (Friedman test with Conover–Holm post-hoc) and Bron–Kerbosch cliques, we find that the learned feature space largely determines OOD efficacy. For both CNNs and ViTs, probabilistic scores (e.g., MSR, GEN) dominate misclassification (ID) detection. Under stronger shifts, geometry-aware scores (e.g., NNGuide, fDBD, CTM) prevail on CNNs, whereas on ViTs GradNorm and KPCA Reconstruction Error remain consistently competitive. We further show a class-count-dependent trade-off for Monte Carlo Dropout (MCD) and that a simple PCA projection improves several detectors. These results support a representation-centric view of OOD detection and provide statistically grounded guidance for method selection under distribution shift.
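The rank-based pipeline in the abstract starts from the Friedman test over per-dataset method rankings. A minimal sketch of the Friedman chi-square statistic, with tie-averaged ranks but omitting the tie-correction factor that full implementations (e.g., `scipy.stats.friedmanchisquare`) apply:

```python
def average_ranks(row):
    """1-based ranks of a row's values (lowest value = rank 1), ties averaged."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(row):
        j = i
        while j + 1 < len(row) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j + 2) / 2.0  # mean of 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def friedman_statistic(scores):
    """Friedman chi-square statistic for scores[dataset][method] (lower = better)."""
    n, k = len(scores), len(scores[0])  # n datasets (blocks), k methods
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Four datasets, three detectors, detector 0 always best: statistic = 8.0
stat = friedman_statistic([[0.10, 0.20, 0.30]] * 4)
```

When the statistic is significant (chi-square with k−1 degrees of freedom), the paper follows up with Conover post-hoc pairwise comparisons under Holm correction, and summarizes indistinguishable method groups as Bron–Kerbosch cliques.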
Problem

Research questions and friction points this paper is trying to address.

How do representation paradigms (CNN vs. ViT) and training paradigms (from scratch vs. fine-tuning) shape OOD detection performance?
To what extent does the learned feature space, rather than the scoring function alone, determine detection efficacy?
Which detectors should practitioners select under distribution shift, and with what statistical confidence?
Innovation

Methods, ideas, or system contributions that make the work stand out.

CLIP-stratified evaluation regimes with AURC and AUGRC as primary metrics
Head-to-head comparison of from-scratch CNNs and a fine-tuned ViT on CIFAR-10/100, SuperCIFAR-100, and TinyImageNet
Rank-based statistical pipeline (Friedman test, Conover–Holm post-hoc) with Bron–Kerbosch clique analysis
Lightweight PCA projection that consistently improves several detectors
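Several of the probabilistic scores evaluated in the paper (e.g., MSR) reduce to simple functions of the classifier's softmax output. A minimal sketch of the maximum softmax response score, assuming raw logits as input (an illustration, not the paper's implementation):

```python
import math

def msr_score(logits):
    """Maximum softmax response: confidence = largest softmax probability."""
    m = max(logits)  # subtract the max logit for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)

# A confidently predicted sample scores near 1; a maximally uncertain
# sample scores near 1/num_classes. Thresholding this value yields a
# misclassification / OOD detector.
```

Geometry-aware alternatives such as NNGuide or fDBD instead score a sample by its position in feature space relative to training data, which is why, per the paper's results, they diverge from probabilistic scores under large shifts.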