Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors

📅 2024-06-21
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the lack of adversarial robustness in post-hoc out-of-distribution (OOD) detectors. It systematically evaluates 16 state-of-the-art methods against FGSM, PGD, and CW attacks, revealing widespread failure. To close this gap, it proposes a formally defined evaluation standard for the adversarial robustness of OOD detection and introduces a hierarchical defense roadmap: level 1 establishes a unified adversarial benchmark, integrating the OpenOOD platform with diverse attacks, models, and datasets, to reliably expose detector vulnerabilities; the final level adds defense against adaptive attacks. Empirical results show that most existing detectors suffer severe performance degradation without protection. The level 1 evaluation is reproducible and scalable, providing a standardized assessment paradigm for secure OOD detector deployment and actionable pathways for robustness improvement.

📝 Abstract
Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in real-world scenarios. In recent years, many OOD detectors have been developed, and benchmarking has even been standardized, i.e., OpenOOD. The number of post-hoc detectors is growing fast: they offer a way to protect a pre-trained classifier against natural distribution shifts and claim to be ready for real-world scenarios. However, their effectiveness against adversarial examples (AdEx) has been neglected in most studies. Even when an OOD detector does include AdEx in its experiments, the lack of uniform AdEx parameters makes it difficult to evaluate the detector's performance accurately. This paper investigates the adversarial robustness of 16 post-hoc detectors against various evasion attacks. It also proposes a roadmap for adversarial defense in OOD detectors that would improve adversarial robustness. We believe that level 1 (AdEx on a unified dataset) should be added to the evaluation of any OOD detector to expose its limitations. The last level of the roadmap (defense against adaptive attacks) is included for completeness from an adversarial machine learning (AML) point of view, although we do not believe it is the ultimate goal for OOD detectors.
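To illustrate the kind of evasion attack the paper evaluates (a minimal sketch, not the authors' code): FGSM perturbs an input by one step along the sign of the loss gradient. The toy logistic "detector", its weights, and the input vector below are illustrative assumptions; real post-hoc detectors instead score the outputs of a pre-trained network.

```python
import math

# Toy stand-in for a score-based OOD detector: a logistic model whose
# output we read as an "in-distribution" confidence score.
# The weights W, bias B, and input x are illustrative assumptions.
W = [1.0, -2.0, 0.5]
B = 0.1

def score(x):
    """Detector confidence in [0, 1]: sigmoid(w.x + b)."""
    z = sum(wi * xi for wi, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))

def fgsm_lower_score(x, eps):
    """One FGSM step that lowers the detector score.
    For s = sigmoid(w.x + b), the gradient w.r.t. x is s*(1-s)*w,
    so sign(grad) = sign(w); stepping against it reduces s."""
    return [xi - eps * math.copysign(1.0, wi) for xi, wi in zip(x, W)]

x = [0.3, -0.4, 1.2]
x_adv = fgsm_lower_score(x, eps=0.1)
print(score(x), score(x_adv))  # the adversarial score is strictly lower
```

PGD iterates this single step with projection back onto the eps-ball, and CW instead solves an optimization problem, which is why attacks of all three kinds are tested against each detector.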
Problem

Research questions and friction points this paper is trying to address.

Deep Learning
Out-of-Distribution Detection
Adversarial Examples
Innovation

Methods, ideas, or system contributions that make the work stand out.

OOD Detection
Adversarial Examples
Robustness Enhancement
Peter Lorenz
Fraunhofer ITWM, Germany; Heidelberg University, Germany
Mario Fernandez
Fraunhofer ITWM, Germany; École Normale Supérieure, PSL University, France
Jens Müller
Heidelberg University, Germany
Ullrich Köthe
Adjunct Professor of Computer Science, University of Heidelberg
Image Analysis · Machine Learning · Scientific Software