Statistical Inference for Autoencoder-based Anomaly Detection after Representation Learning-based Domain Adaptation

📅 2025-08-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Autoencoder-based anomaly detection loses statistical reliability after domain adaptation: when target-domain samples are scarce, the adaptation step introduces extra uncertainty that makes it hard to compute valid p-values or control the false positive rate (FPR). This paper proposes STAND-DA, a framework that applies selective inference to autoencoder-based anomaly detection performed after representation-learning-based domain adaptation. It computes valid p-values for detected anomalies and controls the FPR below a pre-specified level α. To make selective inference practical for large deep models, the authors also develop a GPU-accelerated implementation. Experiments on synthetic and real-world datasets support the method's statistical validity and computational efficiency.

📝 Abstract
Anomaly detection (AD) plays a vital role across a wide range of domains, but its performance might deteriorate when applied to target domains with limited data. Domain Adaptation (DA) offers a solution by transferring knowledge from a related source domain with abundant data. However, this adaptation process can introduce additional uncertainty, making it difficult to draw statistically valid conclusions from AD results. In this paper, we propose STAND-DA, a novel framework for statistically rigorous Autoencoder-based AD after Representation Learning-based DA. Built on the Selective Inference (SI) framework, STAND-DA computes valid $p$-values for detected anomalies and rigorously controls the false positive rate below a pre-specified level $\alpha$ (e.g., 0.05). To address the computational challenges of applying SI to deep learning models, we develop a GPU-accelerated SI implementation, significantly enhancing both scalability and runtime performance. This advancement makes SI practically feasible for modern, large-scale deep architectures. Extensive experiments on synthetic and real-world datasets validate the theoretical results and computational efficiency of the proposed STAND-DA method.
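To make the idea of a valid p-value after selection concrete, here is a minimal toy sketch in Python. It is not the STAND-DA procedure and involves no autoencoder or domain adaptation. It assumes n independent N(0, 1) observations with no true anomalies, flags the one with the largest magnitude, and compares a naive p-value with a selective one that conditions on the flagged point having beaten the runner-up. All function names are hypothetical.

```python
import math
import random


def normal_tail(x):
    """Upper tail P(Z >= x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def naive_p_value(xs):
    """Two-sided p-value for the flagged point, ignoring how it was chosen."""
    top = max(abs(x) for x in xs)
    return 2.0 * normal_tail(top)


def selective_p_value(xs):
    """Toy selective p-value (an illustration, not the paper's statistic).

    Conditional on the flagged point having the largest magnitude, its
    null distribution is a standard normal truncated to |z| >= runner_up,
    so the p-value is a ratio of two normal tail probabilities.
    """
    mags = sorted((abs(x) for x in xs), reverse=True)
    top, runner_up = mags[0], mags[1]
    return normal_tail(top) / normal_tail(runner_up)


def false_positive_rate(p_value_fn, n=20, trials=20000, alpha=0.05, seed=0):
    """Fraction of all-null datasets in which the flagged point is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if p_value_fn(xs) <= alpha:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    print("naive FPR:    ", false_positive_rate(naive_p_value))
    print("selective FPR:", false_positive_rate(selective_p_value))
```

In this toy setting the naive test rejects far more often than α because the flagged point was picked for being extreme, while the conditional p-value is uniform under the null and keeps the rejection rate near α. The paper's setting is harder because the selection event comes from a trained deep model rather than a simple maximum.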
Problem

Research questions and friction points this paper is trying to address.

Improves anomaly detection in target domains with limited data
Addresses uncertainty in domain adaptation for valid statistical conclusions
Enhances computational feasibility for deep learning-based anomaly detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autoencoder-based anomaly detection with domain adaptation
Selective Inference for valid p-value computation
GPU-accelerated SI for scalable deep learning
🔎 Similar Papers
2024-04-17 | IEEE Transactions on Knowledge and Data Engineering | Citations: 0
Tran Tuan Kiet
University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
Nguyen Thang Loi
University of Information Technology, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam
Vo Nguyen Le Duy
Lecturer at University of Information Technology / Visiting Scientist at RIKEN
Machine Learning, Data Science, Statistics