Kurtosis-Guided Denoising Score Matching for Tabular Anomaly Detection

📅 2026-05-07

🤖 AI Summary

This work addresses two weaknesses of denoising score matching (DSM) in tabular anomaly detection: unstable score estimates and insufficient anomaly sensitivity. Both stem from the choice of perturbation scale, which is hard to tune when no validation set or label information is available. The authors propose K-DSM, a method that assigns a noise level to each feature based on the kurtosis of its marginal distribution, so that a single-scale DSM model can be trained without multi-scale or noise-conditioned training. An exponential moving average (EMA) teacher filtering rule is added to mitigate contamination of the training data. The approach reduces reliance on hyperparameter tuning while improving coverage of low-density regions and precision in high-density regions. Experiments on standard tabular benchmarks show that K-DSM achieves state-of-the-art performance in the semi-supervised setting and, combined with EMA-teacher filtering, strong performance in the fully unsupervised setting with contaminated data.
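The idea of per-feature, kurtosis-driven noise can be illustrated with a minimal NumPy sketch. The summary does not state the actual mapping from kurtosis to noise level, so `kurtosis_noise_levels`, its `base_sigma` parameter, and the log-based formula are hypothetical choices made only for illustration; they are not the authors' rule.

```python
import numpy as np

def excess_kurtosis(X):
    """Per-column excess kurtosis (approximately 0 for Gaussian data)."""
    centered = X - X.mean(axis=0)
    m2 = (centered ** 2).mean(axis=0)
    m4 = (centered ** 4).mean(axis=0)
    return m4 / np.maximum(m2, 1e-12) ** 2 - 3.0

def kurtosis_noise_levels(X, base_sigma=0.1):
    """Hypothetical mapping (an assumption, not the paper's formula):
    scale each feature's noise by its standard deviation and give
    heavier-tailed features a larger perturbation."""
    kurt = np.clip(excess_kurtosis(X), 0.0, None)
    return base_sigma * X.std(axis=0) * (1.0 + np.log1p(kurt))

rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(size=5000),            # near-Gaussian feature
    rng.standard_t(df=5, size=5000),  # heavy-tailed feature
])
sigma = kurtosis_noise_levels(X)                 # one noise level per feature
X_noisy = X + sigma * rng.normal(size=X.shape)   # single-scale perturbation
```

In this toy setup the heavy-tailed column receives the larger noise level. Whether noise should grow or shrink with kurtosis in the actual method is not specified in the summary, so the direction used here is also an assumption.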

📝 Abstract

Denoising score matching (DSM) provides a way to learn data distributions by training a neural network to recover the score function, defined as the gradient of the log density, from noise-corrupted samples. Once trained, the score magnitude at a test point reflects how consistent that point is with the learned distribution, making it a natural anomaly signal. The key practical challenge is selecting the perturbation scale: too little noise yields unstable score estimates in sparse regions, while too much erases local structure and weakens anomaly sensitivity. This is compounded by the difficulty of hyperparameter tuning when anomalies are unknown and no validation set is available. We introduce kurtosis-based noise scaling (K-DSM), a per-feature scheme that sets noise levels from the shape of each marginal distribution, improving coverage of low-density regions and precision in high-density regions without extra model complexity. Contrary to prior claims that multi-scale or noise-conditioned training is necessary, we find that a carefully trained single-scale model is already a strong anomaly detector. On standard tabular anomaly detection benchmarks, K-DSM achieves state-of-the-art performance in the semi-supervised setting. When combined with a lightweight EMA-teacher filtering rule that removes low-density training points before each gradient step, it also achieves strong performance in the fully unsupervised (contaminated) setting, suggesting that simple, data-adaptive noise scaling enables robust anomaly detection while reducing reliance on hyperparameter tuning.
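The pieces described in the abstract can be sketched in a few lines of NumPy: the single-scale DSM objective with per-feature noise, the score norm as an anomaly signal, and an EMA-teacher filter. This is a sketch under stated assumptions, not the authors' implementation. The names `dsm_loss`, `anomaly_score`, `ema_update`, and `filter_batch`, the `keep_fraction` and `decay` parameters, and the use of a high teacher score norm as a proxy for "low density" are all hypothetical. A closed-form Gaussian score stands in for the trained neural network.

```python
import numpy as np

def dsm_loss(score_fn, X, sigma, rng):
    """Single-scale DSM loss with a per-feature noise vector `sigma`."""
    eps = rng.normal(size=X.shape)
    X_noisy = X + sigma * eps
    target = -eps / sigma  # score of the Gaussian perturbation kernel
    residual = score_fn(X_noisy) - target
    return float((residual ** 2).sum(axis=1).mean())

def anomaly_score(score_fn, X):
    """Score magnitude at each point; larger means less consistent."""
    return np.linalg.norm(score_fn(X), axis=1)

def ema_update(teacher_params, student_params, decay=0.99):
    """Exponential moving average of (flattened) model parameters."""
    return decay * teacher_params + (1.0 - decay) * student_params

def filter_batch(teacher_score_fn, batch, keep_fraction=0.9):
    """Assumed filtering rule: drop the points the teacher scores highest."""
    scores = anomaly_score(teacher_score_fn, batch)
    return batch[scores <= np.quantile(scores, keep_fraction)]

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))       # toy "normal" data, N(0, I)
sigma = np.array([0.5, 0.5, 0.8])    # illustrative per-feature noise levels

# Stand-in for a trained network: exact score of the noised Gaussian.
gaussian_score = lambda x: -x / (1.0 + sigma ** 2)
zero_score = lambda x: np.zeros_like(x)

loss_good = dsm_loss(gaussian_score, X, sigma, np.random.default_rng(1))
loss_bad = dsm_loss(zero_score, X, sigma, np.random.default_rng(1))

points = np.array([[0.1, 0.0, 0.0], [6.0, 6.0, 6.0]])
scores = anomaly_score(gaussian_score, points)

# Contaminated batch: 99 inliers plus one obvious outlier.
batch = np.vstack([X[:99], [[8.0, 8.0, 8.0]]])
kept = filter_batch(gaussian_score, batch)
```

Here the analytic score attains a lower DSM loss than the trivial zero score, the distant point receives the larger anomaly score, and the filter discards the injected outlier before a hypothetical gradient step.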

Problem

Research questions and friction points this paper is trying to address.

anomaly detection

denoising score matching

noise scaling

tabular data

hyperparameter tuning

Innovation

Methods, ideas, or system contributions that make the work stand out.

kurtosis-based noise scaling

denoising score matching

tabular anomaly detection