Variance Norms for Kernelized Anomaly Detection

📅 2024-07-16
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing anomaly detection methods in separable Banach spaces lack a basis-independent, data-driven distance metric. Method: We propose the variance norm, a generalization of the Mahalanobis distance that relies solely on the inner product structure of a Hilbert space, without assuming a predefined coordinate system or an injective covariance operator. Theoretically, we extend Cameron–Martin theory to non-Gaussian probability measures and show that the variance norm can be consistently estimated from empirical measures and naturally recovers the kernelized Mahalanobis distance in reproducing kernel Hilbert spaces (RKHS). Methodologically, we develop a kernelized nearest-neighbour Mahalanobis distance for semi-supervised time-series anomaly detection. Results: On 12 real-world datasets, the nearest-neighbour variant outperforms the traditional kernelized Mahalanobis distance under state-of-the-art time-series kernels (signature, global alignment, Volterra reservoir), and concentration inequalities in the finite-dimensional Gaussian case provide an initial theoretical justification.
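For orientation, a brief reference sketch (standard textbook definitions, not quoted from the paper) of the classical Mahalanobis distance and the Cameron–Martin-type norm that the variance norm generalizes:

```latex
% Classical Mahalanobis distance of x from mean m, covariance \Sigma invertible:
\[
  d_M(x)^2 = (x - m)^{\top} \Sigma^{-1} (x - m).
\]
% Cameron--Martin-type norm on a Hilbert space H with covariance operator C;
% the paper's variance norm extends this idea to non-Gaussian measures and
% non-injective C (via a pseudo-inverse on the range of C^{1/2}):
\[
  \lVert h \rVert_{\mu} = \lVert C^{-1/2} h \rVert_{H},
  \qquad h \in \operatorname{ran} C^{1/2}.
\]
```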

📝 Abstract
We present a unified theory for Mahalanobis-type anomaly detection on Banach spaces, using ideas from Cameron–Martin theory applied to non-Gaussian measures. This approach leads to a basis-free, data-driven notion of anomaly distance through the so-called variance norm of a probability measure, which can be consistently estimated using empirical measures. Our framework generalizes the classical $\mathbb{R}^d$, functional $(L^2[0,1])^d$, and kernelized settings, including the general case of non-injective covariance operator. We prove that the variance norm depends solely on the inner product in a given Hilbert space, and hence that the kernelized Mahalanobis distance can naturally be recovered by working on reproducing kernel Hilbert spaces. Using the variance norm, we introduce the notion of a kernelized nearest-neighbour Mahalanobis distance for semi-supervised anomaly detection. In an empirical study on 12 real-world datasets, we demonstrate that the kernelized nearest-neighbour Mahalanobis distance outperforms the traditional kernelized Mahalanobis distance for multivariate time series anomaly detection, using state-of-the-art time series kernels such as the signature, global alignment, and Volterra reservoir kernels. Moreover, we provide an initial theoretical justification of nearest-neighbour Mahalanobis distances by developing concentration inequalities in the finite-dimensional Gaussian case.
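To make the kernel-level computation concrete, here is a minimal sketch, not the authors' implementation: it computes a Mahalanobis-type score purely from kernel evaluations via the centred Gram matrix, handling the non-injective covariance case with a pseudo-inverse that truncates near-zero eigenvalues. An RBF kernel stands in for the paper's signature, global alignment, or Volterra reservoir kernels, and the nearest-neighbour variant is one natural reading (smallest Mahalanobis-type distance to any individual training point rather than to the mean); the paper's exact construction may differ.

```python
# Minimal sketch of a kernelized Mahalanobis anomaly score (illustrative,
# not the authors' code). An RBF kernel stands in for time-series kernels.
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * sq)

class KernelMahalanobis:
    """Mahalanobis-type anomaly scores computed entirely from kernel values."""

    def fit(self, X, gamma=1.0, tol=1e-10):
        self.X, self.gamma, self.n = X, gamma, len(X)
        self.K = rbf_kernel(X, X, gamma)
        H = np.eye(self.n) - np.ones((self.n, self.n)) / self.n
        self.Kc = H @ self.K @ H                 # centred Gram matrix
        lam, U = np.linalg.eigh(self.Kc)
        keep = lam > tol * lam.max()             # drop the (near-)null space:
        # the pseudo-inverse handles a non-injective empirical covariance
        self.Kc_pinv = (U[:, keep] / lam[keep]) @ U[:, keep].T
        return self

    def _centred_cross(self, Y):
        """Centred kernel vectors between test rows and the training set."""
        kY = rbf_kernel(Y, self.X, self.gamma)                       # (m, n)
        return (kY - kY.mean(axis=1, keepdims=True)
                   - self.K.mean(axis=0)[None, :] + self.K.mean())

    def score(self, Y):
        """Squared Mahalanobis-type distance of each test point to the mean."""
        kc = self._centred_cross(Y)
        return self.n * np.sum((kc @ self.Kc_pinv) ** 2, axis=1)

    def score_nn(self, Y):
        """Hypothetical nearest-neighbour variant: smallest Mahalanobis-type
        distance from a test point to any single training point."""
        kc = self._centred_cross(Y)                                  # (m, n)
        diffs = kc[:, None, :] - self.Kc[None, :, :]                 # (m, n, n)
        d2 = self.n * np.sum((diffs @ self.Kc_pinv) ** 2, axis=2)
        return d2.min(axis=1)
```

The factor of n in the scores comes from the empirical covariance being an average of n rank-one terms, so the Mahalanobis-type distance reduces to n times the squared norm of the pseudo-inverted centred kernel vector.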
Problem

Research questions and friction points this paper is trying to address.

Extend Mahalanobis distance to separable Banach spaces
Generalize covariance operators to non-injective cases
Improve novelty detection with kernelized nearest-neighbour method
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends Mahalanobis distance to Banach spaces
Introduces kernelized nearest-neighbour Mahalanobis distance
Uses variance norm for anomaly detection
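A toy run of the KernelMahalanobis sketch above; data and parameters are purely illustrative:

```python
# Toy usage of the KernelMahalanobis sketch above (hypothetical data).
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 8))                       # "normal" regime
test = np.vstack([rng.normal(size=(5, 8)),              # in-distribution
                  rng.normal(loc=4.0, size=(5, 8))])    # shifted anomalies

det = KernelMahalanobis().fit(train, gamma=0.1)
print(det.score(test))     # distance-to-mean scores; last 5 should be larger
print(det.score_nn(test))  # nearest-neighbour variant scores
```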