🤖 AI Summary
To address the problem that globally rare events in large-scale tabular data are misclassified as normal within local heterogeneous contexts (e.g., distinct users), this paper proposes the first context-aware unsupervised anomaly detection framework specifically designed for tabular data. The method automatically identifies salient contextual features and models the conditional data distribution using a deep autoencoder—eschewing global distributional assumptions to precisely capture locally normal patterns. Crucially, it innovatively incorporates a context-conditioning mechanism into tabular anomaly detection, enabling end-to-end adaptive learning. Extensive experiments on multiple standard tabular benchmarks demonstrate significant improvements over existing state-of-the-art methods. Results empirically validate that explicit contextual modeling is essential for accurate anomaly detection and yields substantial performance gains.
📝 Abstract
Anomaly detection is critical in domains such as cybersecurity and finance, especially when working with large-scale tabular data. Yet, unsupervised anomaly detection -- where no labeled anomalies are available -- remains a significant challenge. Although various deep learning methods have been proposed to model a dataset's joint distribution, real-world tabular data often contain heterogeneous contexts (e.g., different users), making globally rare events normal under certain contexts. Consequently, relying on a single global distribution can overlook these contextual nuances, degrading detection performance. In this paper, we present a context-conditional anomaly detection framework tailored for tabular datasets. Our approach automatically identifies context features and models the conditional data distribution using a simple deep autoencoder. Extensive experiments on multiple tabular benchmark datasets demonstrate that our method outperforms state-of-the-art approaches, underscoring the importance of context in accurately distinguishing anomalous from normal instances.