🤖 AI Summary
This work addresses the instability of existing out-of-distribution (OOD) detection methods across models and datasets, which the authors attribute to the entanglement of confidence and distributional signals in the feature space. They propose an orthogonal feature decomposition that disentangles penultimate-layer features into a classifier-aligned confidence component and a residual component, and reveal, for the first time, that the residual subspace carries class-specific directional distributional signals. Because the two components are orthogonal by construction, their failure modes are approximately independent; scoring each component separately and fusing the scores via a normalized combination therefore substantially improves generalization robustness. Evaluated across five architectures and five benchmark settings, the approach achieves state-of-the-art or leading performance, ranking first in three settings and attaining the highest overall AUROC with negligible computational overhead.
📝 Abstract
Out-of-distribution (OOD) detection is essential for deploying deep learning models reliably, yet no single method performs consistently across architectures and datasets -- a scorer that leads on one benchmark often falters on another. We attribute this inconsistency to a shared structural limitation: logit-based methods see only the classifier's confidence signal, while feature-based methods attempt to measure membership in the training distribution but do so in the full feature space where confidence and membership are entangled, inheriting architecture-sensitive failure modes. We observe that penultimate features naturally decompose into two orthogonal subspaces: a classifier-aligned component encoding confidence, and a residual component the classifier discards. We discover that this residual carries a class-specific directional signature for in-distribution data -- a membership signal invisible to logit-based methods and entangled with noise in feature-based methods. We propose CORE (COnfidence + REsidual), which disentangles the two signals by scoring each subspace independently and combining the scores via normalized summation. Because the two signals are orthogonal by construction, their failure modes are approximately independent, producing robust detection where either view alone is unreliable. CORE achieves competitive or state-of-the-art performance across five architectures and five benchmark configurations, ranking first in three of five settings and achieving the highest grand average AUROC with negligible computational overhead.
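To make the decomposition concrete, here is a minimal NumPy sketch of the general idea, not the authors' implementation. Everything below is an illustrative assumption: `W` stands in for a trained classifier head, the confidence score is an energy (logsumexp) over the projected logits, the residual score is a best cosine match against per-class mean residual directions, and the normalizing constant in `core_score` is a placeholder for statistics that would be fit on in-distribution data.

```python
import numpy as np

rng = np.random.default_rng(0)
d, num_classes = 64, 10

# Hypothetical classifier head (num_classes x d); in practice, the trained weights.
W = rng.normal(size=(num_classes, d))

# Orthonormal basis for the classifier-aligned subspace (row space of W).
Q, _ = np.linalg.qr(W.T)   # Q: d x num_classes, orthonormal columns
P = Q @ Q.T                # projector onto the confidence subspace

def decompose(z):
    """Split a penultimate feature into classifier-aligned and residual parts."""
    z_conf = P @ z
    z_res = z - z_conf     # orthogonal to z_conf by construction
    return z_conf, z_res

# Synthetic stand-in for in-distribution features and labels.
id_feats = rng.normal(size=(200, d)) + 3.0
labels = rng.integers(0, num_classes, size=200)

# Per-class mean residual directions (the class-specific directional signature).
res_dirs = np.zeros((num_classes, d))
for c in range(num_classes):
    res = np.stack([decompose(z)[1] for z in id_feats[labels == c]])
    mu = res.mean(axis=0)
    res_dirs[c] = mu / (np.linalg.norm(mu) + 1e-12)

def core_score(z):
    """Higher score = more in-distribution (illustrative fusion)."""
    z_conf, z_res = decompose(z)
    # Confidence view: numerically stable logsumexp energy of projected logits.
    logits = W @ z_conf
    m = logits.max()
    s_conf = m + np.log(np.exp(logits - m).sum())
    # Residual view: cosine similarity to the closest class residual direction.
    s_res = (res_dirs @ z_res).max() / (np.linalg.norm(z_res) + 1e-12)
    # Normalized summation; the divisor is a placeholder scale, not fit here.
    return s_conf / 10.0 + s_res

score = core_score(id_feats[0])
print(score)
```

The key property the sketch exercises is that `z_conf` and `z_res` sum back to `z` and are mutually orthogonal, so a failure of one scorer (e.g. an overconfident logit on an OOD input) does not automatically corrupt the other.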