AI Summary
This work addresses dependency-based anomaly detection in high-dimensional tabular data, where complex feature dependencies and heterogeneous noise degrade the performance of existing detectors. The authors propose uLEAD-TabPFN, a framework built on Prior-Data Fitted Networks (PFNs). According to the authors, it is the first to use a frozen TabPFN to model high-dimensional dependencies, flagging anomalies as violations of conditional dependencies in a learned latent space. An uncertainty-aware scoring mechanism is added to improve robustness and scalability. In experiments on 57 datasets from ADBench, the method attains the best average rank in medium- and high-dimensional settings. On high-dimensional datasets, it improves average ROC-AUC by nearly 20% over the average baseline and by approximately 2.8% over the strongest baseline, while remaining superior overall to existing state-of-the-art approaches.
Abstract
Anomaly detection in tabular data is challenging due to high dimensionality, complex feature dependencies, and heterogeneous noise. Many existing methods rely on proximity-based cues and may miss anomalies caused by violations of complex feature dependencies. Dependency-based anomaly detection provides a principled alternative by identifying anomalies as violations of dependencies among features. However, existing methods often struggle to model such dependencies robustly and to scale to high-dimensional data with complex dependency structures. To address these challenges, we propose uLEAD-TabPFN, a dependency-based anomaly detection framework built on Prior-Data Fitted Networks (PFNs). uLEAD-TabPFN identifies anomalies as violations of conditional dependencies in a learned latent space, leveraging frozen PFNs for dependency estimation. Combined with uncertainty-aware scoring, the proposed framework enables robust and scalable anomaly detection. Experiments on 57 tabular datasets from ADBench show that uLEAD-TabPFN achieves particularly strong performance in medium- and high-dimensional settings, where it attains the top average rank. On high-dimensional datasets, uLEAD-TabPFN improves the average ROC-AUC by nearly 20% over the average baseline and by approximately 2.8% over the best-performing baseline, while maintaining overall superior performance compared to state-of-the-art methods. Further analysis shows that uLEAD-TabPFN provides complementary anomaly detection capability, achieving strong performance on datasets where many existing methods struggle.
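To make the general idea of dependency-based scoring concrete, the following is a minimal sketch and not the authors' implementation. It assumes a ridge-regularised linear model as a stand-in for the frozen PFN, works directly on raw features rather than a learned latent space, and uses the training residual spread of each feature as a crude proxy for predictive uncertainty. The function names `fit_dependency_models` and `dependency_scores` and the synthetic data are hypothetical, introduced only for illustration.

```python
import numpy as np


def fit_dependency_models(X_train, ridge=1e-3):
    """Fit one linear model per feature, predicting it from all other features.

    Assumption: a ridge-regularised linear regressor stands in for the frozen
    PFN described in the abstract; this is an illustration only.
    """
    n, d = X_train.shape
    models = []
    for j in range(d):
        # Design matrix: every feature except j, plus a bias column (d columns).
        A = np.hstack([np.delete(X_train, j, axis=1), np.ones((n, 1))])
        w = np.linalg.solve(A.T @ A + ridge * np.eye(d), A.T @ X_train[:, j])
        # Residual spread on training data: a crude per-feature uncertainty proxy.
        sigma = (X_train[:, j] - A @ w).std() + 1e-8
        models.append((w, sigma))
    return models


def dependency_scores(X, models):
    """Mean squared residual over features, each scaled by its uncertainty proxy."""
    n, d = X.shape
    scores = np.zeros(n)
    for j, (w, sigma) in enumerate(models):
        A = np.hstack([np.delete(X, j, axis=1), np.ones((n, 1))])
        scores += ((X[:, j] - A @ w) / sigma) ** 2
    return scores / d


# Synthetic data (hypothetical): x2 depends on x1, and x3 depends on both.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = 2.0 * x1 + 0.1 * rng.normal(size=500)
x3 = x1 - x2 + 0.1 * rng.normal(size=500)
X_train = np.column_stack([x1, x2, x3])

models = fit_dependency_models(X_train)
train_scores = dependency_scores(X_train, models)

probe = np.array([
    [1.0, 2.0, -1.0],   # consistent with x2 = 2*x1 and x3 = x1 - x2
    [1.0, -2.0, -1.0],  # each value is individually plausible, but x2 != 2*x1
])
probe_scores = dependency_scores(probe, models)
print("max training score:", train_scores.max())
print("probe scores:", probe_scores)
```

In this toy setting the second probe row should score far above any training row, because its value of x2 contradicts what the other features predict even though it lies within the marginal range of x2. Dividing each residual by a per-feature spread keeps noisy, hard-to-predict features from dominating the score, which is the intuition behind uncertainty-aware scoring, although the paper's actual mechanism may differ.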