Statistical Inference for Autoencoder-based Anomaly Detection after Representation Learning-based Domain Adaptation

📅 2025-08-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Autoencoder-based anomaly detection lacks statistical reliability after domain adaptation: target-domain samples are scarce and the adaptation step introduces uncertainty, which together hinder valid p-value computation and false positive rate (FPR) control. To address this, the paper proposes STAND-DA, described as the first framework to systematically integrate selective inference into deep anomaly detection after domain adaptation. Building on representation-learning-based domain adaptation, it provides rigorous statistical inference for autoencoder reconstructions, enabling closed-form p-value computation and theoretical FPR control. The method constructs a provably valid test statistic and uses a GPU-accelerated algorithm to scale to large models. Experiments on synthetic and real-world benchmarks support STAND-DA's statistical validity, its FPR control with finite target samples, and a substantial computational speedup over baseline approaches.

📝 Abstract

Anomaly detection (AD) plays a vital role across a wide range of domains, but its performance can deteriorate when applied to target domains with limited data. Domain Adaptation (DA) offers a solution by transferring knowledge from a related source domain with abundant data. However, this adaptation process can introduce additional uncertainty, making it difficult to draw statistically valid conclusions from AD results. In this paper, we propose STAND-DA, a novel framework for statistically rigorous Autoencoder-based AD after Representation Learning-based DA. Built on the Selective Inference (SI) framework, STAND-DA computes valid $p$-values for detected anomalies and rigorously controls the false positive rate below a pre-specified level $\alpha$ (e.g., 0.05). To address the computational challenges of applying SI to deep learning models, we develop a GPU-accelerated SI implementation that significantly improves both scalability and runtime performance. This advancement makes SI practically feasible for modern, large-scale deep architectures. Extensive experiments on synthetic and real-world datasets validate the theoretical results and computational efficiency of the proposed STAND-DA method.
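The core idea of selective inference, conditioning the p-value on the event that led to a sample being flagged, can be illustrated with a deliberately simplified toy sketch. This is not the STAND-DA algorithm: it assumes a single standard-normal test statistic under the null and a hypothetical selection rule `|z| > c`, and the function names are illustrative only.

```python
import math
import random


def normal_sf(x: float) -> float:
    """Survival function 1 - Phi(x) of the standard normal."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def naive_p_value(z: float) -> float:
    """Two-sided p-value that ignores how the sample was selected."""
    return 2.0 * normal_sf(abs(z))


def selective_p_value(z: float, c: float) -> float:
    """P(|Z| >= |z| given |Z| > c) for Z ~ N(0, 1).

    Conditioning on the selection event |Z| > c (an assumed toy rule)
    turns the null distribution into a truncated normal.
    """
    return normal_sf(abs(z)) / normal_sf(c)


def simulate_fpr(n_trials: int = 200_000, c: float = 2.0,
                 alpha: float = 0.05, seed: int = 0):
    """Estimate the FPR among selected samples when every sample is null."""
    rng = random.Random(seed)
    selected = naive_rejections = selective_rejections = 0
    for _ in range(n_trials):
        z = rng.gauss(0.0, 1.0)
        if abs(z) <= c:  # not flagged as anomalous, so never tested
            continue
        selected += 1
        naive_rejections += naive_p_value(z) <= alpha
        selective_rejections += selective_p_value(z, c) <= alpha
    return naive_rejections / selected, selective_rejections / selected


naive_fpr, selective_fpr = simulate_fpr()
print(f"naive FPR among selected:     {naive_fpr:.3f}")
print(f"selective FPR among selected: {selective_fpr:.3f}")
```

With the assumed threshold `c = 2.0`, every flagged null sample has a naive p-value below 0.05, so the naive test rejects all of them, while the selective p-value rejects roughly a fraction $\alpha$ of them. In the paper's setting the selection event involves the domain adaptation and autoencoder pipeline rather than a fixed threshold, which is what makes the computation demanding.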

Problem

Research questions and friction points this paper is trying to address.

Anomaly detection performance degrades in target domains with limited data

Domain adaptation introduces uncertainty that undermines statistically valid conclusions

Selective inference is computationally costly for deep learning-based anomaly detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Autoencoder-based anomaly detection with domain adaptation

Selective Inference for valid p-value computation

GPU-accelerated SI for scalable deep learning
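For readers unfamiliar with the detection step that the inference is applied to, the following minimal sketch shows reconstruction-error scoring. It is an assumption-laden stand-in rather than the paper's model: it uses a tied-weight linear autoencoder with a one-dimensional code (equivalent to projecting onto the first principal direction), omits domain adaptation entirely, and uses hypothetical synthetic data and a hypothetical 99th-percentile threshold.

```python
import math
import random


def fit_linear_autoencoder(data, n_iter=100):
    """Fit a 1-D tied-weight linear autoencoder via power iteration.

    Returns the data mean and a unit vector w; encoding is w . (x - mean)
    and decoding is code * w.
    """
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    w = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        v = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    return mean, w


def reconstruction_error(x, mean, w):
    """Squared distance between x and its autoencoder reconstruction."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    code = sum(ci * wi for ci, wi in zip(centered, w))  # encoder
    recon = [code * wi for wi in w]                     # decoder
    return sum((ci - ri) ** 2 for ci, ri in zip(centered, recon))


# Hypothetical training data: points scattered tightly around the line y = 2x.
rng = random.Random(0)
normal_points = []
for _ in range(200):
    t = rng.gauss(0.0, 1.0)
    normal_points.append([t + rng.gauss(0.0, 0.05),
                          2.0 * t + rng.gauss(0.0, 0.05)])

mean, w = fit_linear_autoencoder(normal_points)
scores = sorted(reconstruction_error(x, mean, w) for x in normal_points)
threshold = scores[int(0.99 * len(scores))]  # assumed 99th-percentile cutoff

candidate = [2.0, -1.0]  # lies far from the line y = 2x
candidate_score = reconstruction_error(candidate, mean, w)
is_anomaly = candidate_score > threshold
print(is_anomaly)
```

A point flagged this way has been selected by the data-dependent detector itself, which is why a naive test on its reconstruction error is not valid and a selective-inference correction is needed.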

🔎 Similar Papers

DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series