🤖 AI Summary
Real-world unsupervised anomaly detection often suffers severe performance degradation when training data contain unlabeled or mislabeled anomalies. Existing robust methods typically require access to training procedures, strong data priors, or assumptions about anomaly prevalence, which limits their practical applicability. To address this, we propose EPHAD: the first training-agnostic, test-time adaptation framework for unsupervised anomaly detection that operates purely as post-processing. EPHAD refines detector outputs by fusing heterogeneous external evidence, including multimodal foundation models (e.g., CLIP) and classical detectors (e.g., the Latent Outlier Factor), without modifying the original model architecture or training pipeline. Its core innovations are a principled evidence integration mechanism and a Bayesian posterior adjustment strategy. Evaluated on eight vision datasets, twenty-six tabular datasets, and one industrial dataset, EPHAD consistently improves robustness and generalization across diverse detectors under varying contamination levels, demonstrating strong universality and plug-and-play compatibility.
📝 Abstract
Unsupervised anomaly detection (AD) methods typically assume clean training data, yet real-world datasets often contain undetected or mislabeled anomalies, leading to significant performance degradation. Existing solutions require access to the training pipeline, the training data, or prior knowledge of the proportion of anomalies in the data, limiting their real-world applicability. To address this challenge, we propose EPHAD, a simple yet effective test-time adaptation framework that updates the outputs of AD models trained on contaminated datasets using evidence gathered at test time. Our approach integrates the prior knowledge captured by the AD model trained on a contaminated dataset with evidence derived from multimodal foundation models like Contrastive Language-Image Pre-training (CLIP), classical AD methods like the Latent Outlier Factor, or domain-specific knowledge. We illustrate the intuition behind EPHAD using a synthetic toy example and validate its effectiveness through comprehensive experiments across eight visual AD datasets, twenty-six tabular AD datasets, and a real-world industrial AD dataset. Additionally, we conduct an ablation study to analyse hyperparameter influence and robustness to varying contamination levels, demonstrating the versatility and robustness of EPHAD across diverse AD model and evidence pairs. To ensure reproducibility, our code is publicly available at https://github.com/sukanyapatra1997/EPHAD.
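The abstract describes updating a contaminated detector's outputs with test-time evidence via a Bayesian-style posterior adjustment. The sketch below is a minimal illustration of that general idea, not the paper's actual update rule: it treats normalised detector scores as prior anomaly probabilities and normalised evidence scores (e.g., CLIP similarities or Latent Outlier Factor scores) as likelihood terms, combining them in log-odds space. The function name, the normalisation scheme, and the `weight` parameter are all hypothetical choices made for this example.

```python
import numpy as np

def fuse_with_evidence(prior_scores, evidence_scores, weight=1.0):
    """Illustrative test-time fusion of detector scores with external evidence.

    NOTE: a hypothetical sketch of Bayesian-style score fusion, not EPHAD's
    exact mechanism. Higher scores mean "more anomalous" for both inputs.
    """
    def to_prob(scores):
        # Min-max normalise into (0, 1), clipped away from 0 and 1
        # so the log-odds below stay finite.
        s = np.asarray(scores, dtype=float)
        s = (s - s.min()) / (s.max() - s.min() + 1e-12)
        return np.clip(s, 1e-6, 1 - 1e-6)

    prior = to_prob(prior_scores)        # from the (possibly contaminated) AD model
    evidence = to_prob(evidence_scores)  # e.g., CLIP similarity or LOF scores

    # Bayesian-style update in log-odds space:
    # posterior odds = prior odds * (evidence odds)^weight
    log_odds = np.log(prior / (1 - prior)) + weight * np.log(evidence / (1 - evidence))
    return 1.0 / (1.0 + np.exp(-log_odds))  # posterior anomaly probability

# Samples flagged by both the detector and the evidence end up with the
# highest posterior; agreement on "normal" pushes the posterior down.
posterior = fuse_with_evidence([0.0, 1.0, 0.5], [0.0, 1.0, 0.5])
```

Because the fusion only touches the final scores, it is plug-and-play in the sense the summary describes: any detector and any evidence source that emit per-sample scores can be combined without retraining either one.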