🤖 AI Summary
To address the poor scalability, weak interpretability, and lack of theoretical robustness guarantees in semi-supervised anomaly detection for tabular data, this paper proposes Time-Conditioned Contraction Matching (TCCM). TCCM learns a time-conditioned vector field in input space that contracts normal data toward the origin, avoiding numerical ODE solvers in both training and inference. Its key contributions are: (1) a simplified, flow-matching-inspired training objective that is lightweight and scalable; (2) a one time-step deviation score computed in a single forward pass, which speeds up inference; and (3) feature-level interpretability together with a robustness guarantee that follows from the Lipschitz continuity of the score with respect to the input. On the ADBench benchmark, TCCM offers a favorable balance between detection accuracy and inference cost and outperforms state-of-the-art methods, especially on high-dimensional and large-scale tabular datasets.
📝 Abstract
We introduce Time-Conditioned Contraction Matching (TCCM), a novel method for semi-supervised anomaly detection in tabular data. TCCM is inspired by flow matching, a recent generative modeling framework that learns velocity fields between probability distributions and has shown strong performance compared to diffusion models and generative adversarial networks. Instead of directly applying flow matching as originally formulated, TCCM builds on its core idea -- learning velocity fields between distributions -- but simplifies the framework by predicting a time-conditioned contraction vector toward a fixed target (the origin) at each sampled time step. This design offers three key advantages: (1) a lightweight and scalable training objective that removes the need for solving ordinary differential equations during training and inference; (2) an efficient scoring strategy called one time-step deviation, which quantifies deviation from expected contraction behavior in a single forward pass, addressing the inference bottleneck of existing continuous-time models such as DTE (a diffusion-based model with leading anomaly detection accuracy but heavy inference cost); and (3) explainability and provable robustness, as the learned velocity field operates directly in input space, making the anomaly score inherently feature-wise attributable; moreover, the score function is Lipschitz-continuous with respect to the input, providing theoretical guarantees under small perturbations. Extensive experiments on the ADBench benchmark show that TCCM strikes a favorable balance between detection accuracy and inference cost, outperforming state-of-the-art methods -- especially on high-dimensional and large-scale datasets. The source code is available at our GitHub repository.
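To make the contraction-and-deviation idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the regression target for the velocity field is `-x` (a straight pull toward the origin) and that the one time-step deviation is the norm of `f(x, t) + x` at a fixed `t`; neither detail is spelled out in the abstract. To stay dependency-light, it swaps in ridge regression on random Fourier features where a neural network would normally be trained. The names `features`, `fit_field`, `deviation`, and `t_eval` are hypothetical.

```python
import numpy as np

def features(x, t, W, b):
    """Random Fourier features of the concatenated input [x, t]."""
    return np.cos(np.hstack([x, t]) @ W + b)

def fit_field(x_normal, rng, n_features=300, ridge=1e-2):
    """Fit f(x, t) ~ -x on normal data only (assumed contraction target)."""
    n, d = x_normal.shape
    W = rng.normal(size=(d + 1, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    t = rng.uniform(0.0, 1.0, size=(n, 1))  # one sampled time per training point
    phi = features(x_normal, t, W, b)
    gram = phi.T @ phi + ridge * np.eye(n_features)
    coef = np.linalg.solve(gram, phi.T @ (-x_normal))  # closed-form ridge solution
    return W, b, coef

def deviation(x, model, t_eval=0.5):
    """Score and per-feature attribution from a single evaluation of the field."""
    W, b, coef = model
    t = np.full((x.shape[0], 1), t_eval)
    residual = features(x, t, W, b) @ coef + x  # zero if contraction is as expected
    return np.linalg.norm(residual, axis=1), np.abs(residual)

# Synthetic data (illustrative only): normal points near the origin,
# anomalies shifted away from the training distribution.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(1000, 4))
x_normal = rng.normal(size=(200, 4))
x_anomaly = rng.normal(loc=5.0, size=(50, 4))

model = fit_field(x_train, rng)
scores_normal, _ = deviation(x_normal, model)
scores_anomaly, attributions = deviation(x_anomaly, model)
print("mean score, held-out normal:", scores_normal.mean())
print("mean score, anomalies:", scores_anomaly.mean())
```

Because the residual lives in input space, its absolute value per coordinate (`attributions` above) can be read directly as a feature-wise contribution to the score. Under the same assumed score, if the learned field is L-Lipschitz in x, the triangle inequality gives |s(x) - s(x')| <= (L + 1) * ||x - x'||, which is the kind of perturbation bound the abstract refers to.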