On the Adversarial Robustness of Learning-based Conformal Novelty Detection

📅 2025-09-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the robustness of AdaDetect, a learning-based novelty detection framework with finite-sample false discovery rate (FDR) control guarantees, under adversarial perturbations. We show that its statistical FDR guarantee can break down under attack, propose the first oracle attack model tailored to conformal novelty detection, and derive a theoretical upper bound on the resulting FDR degradation. We further design a practical black-box attack algorithm requiring only label queries, integrating two mainstream query-based adversarial strategies. Extensive evaluations on synthetic and real-world datasets demonstrate that adversarial perturbations can severely violate FDR control while preserving high detection power. Our findings expose a fundamental fragility of existing statistical guarantees in adversarial settings, establishing both a theoretical benchmark and an empirical foundation for developing robust, error-controlled novelty detection methods.

📝 Abstract
This paper studies the adversarial robustness of conformal novelty detection. In particular, we focus on AdaDetect, a powerful learning-based framework for novelty detection with finite-sample false discovery rate (FDR) control. While AdaDetect provides rigorous statistical guarantees under benign conditions, its behavior under adversarial perturbations remains unexplored. We first formulate an oracle attack setting that quantifies the worst-case degradation of FDR, deriving an upper bound that characterizes the statistical cost of attacks. This idealized formulation directly motivates a practical and effective attack scheme that only requires query access to AdaDetect's output labels. Coupling these formulations with two popular and complementary black-box adversarial algorithms, we systematically evaluate the vulnerability of AdaDetect on synthetic and real-world datasets. Our results show that adversarial perturbations can significantly increase the FDR while maintaining high detection power, exposing fundamental limitations of current error-controlled novelty detection methods and motivating the development of more robust alternatives.
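
For context, here is a minimal sketch of the quantities at stake, assuming the standard conformal novelty detection setup; the notation below is illustrative, not taken from the paper. Held-out null calibration scores yield conformal p-values, and Benjamini-Hochberg applied to those p-values at level α gives finite-sample FDR control whenever null test scores are exchangeable with the calibration scores. The attack's target is precisely this guarantee.

```latex
% False discovery rate: expected fraction of true nulls among the rejections,
% with V = number of false discoveries and R = total number of discoveries.
\mathrm{FDR} \;=\; \mathbb{E}\!\left[\frac{V}{\max(R,\,1)}\right]

% Conformal p-value for a test point X_j, given scores s(Z_1), \dots, s(Z_m)
% computed on held-out null (calibration) data:
p_j \;=\; \frac{1 + \#\{\, i : s(Z_i) \ge s(X_j) \,\}}{m + 1}
```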
Problem

Research questions and friction points this paper is trying to address.

Studying adversarial robustness of learning-based conformal novelty detection methods
Analyzing vulnerability of AdaDetect framework under adversarial perturbations
Demonstrating how attacks increase false discovery rate while maintaining detection power
Innovation

Methods, ideas, or system contributions that make the work stand out.

Formulated an oracle attack quantifying worst-case FDR degradation
Developed a practical attack that queries only AdaDetect's output labels (see the sketch after this list)
Evaluated AdaDetect's vulnerability using two complementary black-box adversarial algorithms
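
The label-query attack can be illustrated with a deliberately naive sketch: a random-search, decision-based loop that sees only the detector's final novel/not-novel label. Everything here is an assumption for illustration; `detect`, the L-infinity budget, and the random-search strategy are placeholders, not the paper's actual algorithms.

```python
import numpy as np

# Hypothetical sketch of a label-only (decision-based) black-box attack in the
# spirit described above. The attacker queries only the detector's final
# novel / not-novel label -- never its scores or p-values -- and nudges a true
# null test point until it is flagged as novel, inflating the FDR. `detect`
# is a placeholder for the end-to-end decision rule, not AdaDetect's real API,
# and plain random search stands in for the paper's two query-based attacks.

def label_query_attack(x, detect, eps=0.5, max_queries=1000, seed=0):
    """Random search within an L-infinity ball using only label queries.

    x           : clean null input (1-D numpy array)
    detect      : callable, detect(x) -> True iff x is flagged as novel
    eps         : L-infinity perturbation budget
    max_queries : query budget
    """
    rng = np.random.default_rng(seed)
    for _ in range(max_queries):
        delta = rng.uniform(-eps, eps, size=x.shape)  # candidate perturbation
        if detect(x + delta):        # one label query to the detector
            return x + delta         # success: null point now flagged as novel
    return None                      # budget exhausted without a label flip
```

In practice one would swap the random search for a more query-efficient decision-based attack; the point of the sketch is only that label access alone suffices to target the FDR guarantee.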
Daofu Zhang
Department of Electrical and Computer Engineering, University of Utah
Mehrdad Pournaderi
University of Utah
statistics, signal processing
Hanne M. Clifford
Department of Electrical Engineering and Computer Science, Syracuse University
Yu Xiang
Department of Electrical and Computer Engineering, University of Utah
Pramod K. Varshney
Department of Electrical Engineering and Computer Science, Syracuse University