Investigating Mask-aware Prototype Learning for Tabular Anomaly Detection

📅 2025-06-03

📈 Citations: 0

✨ Influential: 0

career value

189K/year

🤖 AI Summary

Table anomaly detection suffers from feature entanglement and insufficient modeling of global inter-field dependencies. To address this, we propose a masked-aware disentangled representation learning framework coupled with explicit prototype extraction. First, disentanglement is achieved via parallel masked modeling and orthogonal basis projection. Second, projection-space learning and prototype construction are jointly formulated as an optimal transport problem, where a calibrated distance metric is introduced to refine anomaly scoring. To the best of our knowledge, this is the first work to incorporate optimal transport into prototype-based learning for enhancing discriminability of normal patterns. Evaluated on 20 standard tabular datasets, our method achieves significant improvements over state-of-the-art approaches—yielding an average F1-score gain of 3.2%—while maintaining strong interpretability: each prototype is semantically grounded in a coherent, normal subpopulation.

Technology Category

Application Category

📝 Abstract

Tabular anomaly detection, which aims at identifying deviant samples, has been crucial in a variety of real-world applications, such as medical disease identification, financial fraud detection, intrusion monitoring, etc. Although recent deep learning-based methods have achieved competitive performances, these methods suffer from representation entanglement and the lack of global correlation modeling, which hinders anomaly detection performance. To tackle the problem, we incorporate mask modeling and prototype learning into tabular anomaly detection. The core idea is to design learnable masks by disentangled representation learning within a projection space and extracting normal dependencies as explicit global prototypes. Specifically, the overall model involves two parts: (i) During encoding, we perform mask modeling in both the data space and projection space with orthogonal basis vectors for learning shared disentangled normal patterns; (ii) During decoding, we decode multiple masked representations in parallel for reconstruction and learn association prototypes to extract normal characteristic correlations. Our proposal derives from a distribution-matching perspective, where both projection space learning and association prototype learning are formulated as optimal transport problems, and the calibration distances are utilized to refine the anomaly scores. Quantitative and qualitative experiments on 20 tabular benchmarks demonstrate the effectiveness and interpretability of our model.

Problem

Research questions and friction points this paper is trying to address.

Addressing representation entanglement in tabular anomaly detection

Improving global correlation modeling for anomaly detection

Integrating mask modeling and prototype learning techniques

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mask modeling for disentangled representation learning

Prototype learning to capture global correlations

Optimal transport for distribution-matching calibration

🔎 Similar Papers

No similar papers found.

Base Pay Range: $48.00 - $58.00 per hour based on pursuit of a Masters and Ph.D.

USA-CA-Milpitas-KLA

Machine Learning Engineer