Why the Counterintuitive Phenomenon of Likelihood Rarely Appears in Tabular Anomaly Detection with Deep Generative Models?

📅 2026-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates why the counterintuitive phenomenon, in which anomalous samples receive higher likelihoods than inliers, is rarely observed when deep generative models are applied to tabular anomaly detection. The authors formalize a general definition of the phenomenon and systematically evaluate its occurrence across 47 tabular datasets and 10 computer vision/natural language processing embedding datasets from the ADBench benchmark, complemented by theoretical and empirical analyses linking its prevalence to data dimensionality and feature correlations. Using normalizing flows, which are invertible and yield exact likelihoods, the experiments compare against 13 baseline methods and show that the phenomenon is uncommon in tabular data. The results support likelihood-only scoring as a reliable and practical approach to anomaly detection in the tabular domain.

📝 Abstract

Deep generative models with tractable and analytically computable likelihoods, exemplified by normalizing flows, offer an effective basis for anomaly detection through likelihood-based scoring. We demonstrate that, unlike in the image domain where deep generative models frequently assign higher likelihoods to anomalous data, such counterintuitive behavior occurs far less often in tabular settings. We first introduce a domain-agnostic formulation that enables consistent detection and evaluation of the counterintuitive phenomenon, addressing the absence of a precise definition. Through extensive experiments on 47 tabular datasets and 10 CV/NLP embedding datasets in ADBench, benchmarked against 13 baseline models, we demonstrate that the phenomenon, as defined, is consistently rare in general tabular data. We further investigate this phenomenon from both theoretical and empirical perspectives, focusing on the roles of data dimensionality and differences in feature correlation. Our results suggest that likelihood-only detection with normalizing flows offers a practical and reliable approach for anomaly detection in tabular domains.
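
To make likelihood-based scoring concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single affine flow (a whitening transform fitted with NumPy) as a stand-in for a deep normalizing flow, synthetic Gaussian data in place of ADBench, and a simple hypothetical proxy for the counterintuitive phenomenon (AUROC below 0.5 when scoring by negative log-likelihood). The paper's own formal definition is not reproduced here, and all function names are illustrative.

```python
import numpy as np

def fit_affine_flow(x_train):
    """Fit the affine flow z = L^{-1}(x - mu), with L the Cholesky factor
    of the training covariance (assumption: a stand-in for a deep flow)."""
    mu = x_train.mean(axis=0)
    dim = x_train.shape[1]
    cov = np.cov(x_train, rowvar=False) + 1e-6 * np.eye(dim)  # small jitter
    chol = np.linalg.cholesky(cov)
    return mu, chol

def log_likelihood(x, mu, chol):
    """Exact log p(x) via change of variables:
    log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z = np.linalg.solve(chol, (x - mu).T).T
    dim = x.shape[1]
    log_base = -0.5 * (z ** 2).sum(axis=1) - 0.5 * dim * np.log(2 * np.pi)
    log_det = -np.log(np.diag(chol)).sum()  # dz/dx = L^{-1}
    return log_base + log_det

def auroc(scores, labels):
    """Probability that a random anomaly outscores a random inlier."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

# Synthetic data (assumption): inliers near the origin, anomalies shifted.
rng = np.random.default_rng(0)
inliers_train = rng.normal(0.0, 1.0, size=(500, 4))
inliers_test = rng.normal(0.0, 1.0, size=(200, 4))
anomalies = rng.normal(4.0, 1.0, size=(50, 4))

mu, chol = fit_affine_flow(inliers_train)
x_test = np.vstack([inliers_test, anomalies])
labels = np.concatenate([np.zeros(200), np.ones(50)])

# Likelihood-only anomaly score: lower likelihood means more anomalous.
scores = -log_likelihood(x_test, mu, chol)
auc = auroc(scores, labels)

# Hypothetical proxy: anomalies tend to receive HIGHER likelihood than inliers.
counterintuitive = auc < 0.5
print(f"AUROC = {auc:.3f}, counterintuitive = {counterintuitive}")
```

In this toy setting the anomalies lie far from the training distribution, so the proxy flag should come out false. With a real flow and real data, the same scoring and check would apply to the model's exact log-likelihoods.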

Problem

Research questions and friction points this paper is trying to address.

tabular anomaly detection

deep generative models

likelihood

counterintuitive phenomenon

normalizing flows

Innovation

Methods, ideas, or system contributions that make the work stand out.

likelihood-based anomaly detection

normalizing flows

counterintuitive phenomenon