Scaling Pretrained Representations Enables Label-Free Out-of-Distribution Detection Without Fine-Tuning

📅 2026-05-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Deep learning models often lack a reliable mechanism for flagging out-of-distribution (OOD) inputs. This work studies OOD detection that needs neither labels nor fine-tuning, using representations from frozen pretrained models and comparing two detectors: a global Mahalanobis distance estimator and ReSCOPED, a local diffusion-based typicality estimator. The study shows that the geometric structure of frozen pretrained representations is sufficient for accurate label-free OOD detection. Across 59 backbone-task pairings spanning vision and language, both estimators improve as representation quality improves, and the performance gap between them disappears as representations scale.

📝 Abstract

Models trained with deep learning often fail to signal when inputs fall outside their training data manifold, leading to unreliable predictions under distribution shift. Prior work suggests that effective out-of-distribution (OOD) detection often requires class-conditional modeling or specialized models obtained through supervised fine-tuning. We revisit this assumption in modern pretrained models and show that their frozen representations already encode sufficient geometric structure for accurate label-free OOD detection. Across 59 backbone-task pairings spanning vision and language, we compare two complementary label-free detectors: a global Mahalanobis estimator fit on unlabeled latent representations, and ReSCOPED, a lightweight, diffusion-based typicality estimator operating on the same features at a local level. Despite their different detection mechanisms, representation scaling reveals a consistent regime-dependent pattern: both local and global detectors' absolute performance improves with better representation quality, and performance gaps between the two detectors disappear across both language and vision tasks as representations scale. These results suggest that label-free OOD detection depends strongly on the geometry exposed by frozen pretrained backbones, reducing the importance of detector choice as backbone scale increases and enabling efficient deployment directly on frozen models.
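
To make the global detector concrete, the following is a minimal sketch of a label-free Mahalanobis score: one mean and one covariance are fit on unlabeled features, and each test feature is scored by its squared distance to that fit. It is an illustration only, not the paper's implementation. The function names, the `ridge` regularizer, and the synthetic Gaussian arrays standing in for frozen-backbone features are all assumptions made for this example. ReSCOPED is not sketched because the abstract does not describe its procedure.

```python
import numpy as np


def fit_mahalanobis(features, ridge=1e-6):
    """Fit one global Gaussian (no class labels) to unlabeled features.

    `ridge` is a hypothetical regularizer added so the covariance stays
    invertible; it is not a parameter taken from the paper.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = centered.T @ centered / max(len(features) - 1, 1)
    cov += ridge * np.eye(cov.shape[0])
    precision = np.linalg.inv(cov)
    return mean, precision


def mahalanobis_score(features, mean, precision):
    """Squared Mahalanobis distance per row; larger means more atypical."""
    diff = features - mean
    return np.einsum("ij,jk,ik->i", diff, precision, diff)


# Synthetic stand-ins for features extracted from a frozen backbone (assumption).
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(2000, 16))      # unlabeled in-distribution features
in_dist = rng.normal(size=(200, 16))           # held-out in-distribution features
shifted = rng.normal(loc=3.0, size=(200, 16))  # features from a shifted distribution

mean, precision = fit_mahalanobis(train_feats)
id_scores = mahalanobis_score(in_dist, mean, precision)
ood_scores = mahalanobis_score(shifted, mean, precision)
```

In this toy setup the shifted samples receive higher scores on average than the held-out in-distribution samples, so thresholding the score would flag them. With real backbones, the feature arrays would come from a forward pass of the frozen model, and the threshold would be chosen on held-out in-distribution data.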

Problem

Research questions and friction points this paper is trying to address.

out-of-distribution detection

pretrained representations

label-free

distribution shift

frozen models

Innovation

Methods, ideas, or system contributions that make the work stand out.

label-free OOD detection

pretrained representations

representation scaling