Variance Norms for Kernelized Anomaly Detection

📅 2024-07-16
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing anomaly detection methods in separable Banach spaces lack a basis-independent, data-driven distance metric. Method: We propose the variance norm, a generalization of the Mahalanobis distance that relies solely on the inner product structure of a Hilbert space, without assuming a predefined coordinate system or an injective covariance operator. Theoretically, we extend Cameron–Martin theory to non-Gaussian probability measures and show that the variance norm can be consistently estimated from empirical measures and naturally recovers the kernelized Mahalanobis distance in reproducing kernel Hilbert spaces (RKHS). Methodologically, we develop a kernelized nearest-neighbour Mahalanobis distance for semi-supervised time-series anomaly detection. Results: On 12 real-world datasets, the nearest-neighbour variant outperforms the traditional kernelized Mahalanobis distance under state-of-the-art time-series kernels (signature, global alignment, Volterra reservoir), and concentration inequalities in the finite-dimensional Gaussian case provide an initial theoretical justification.
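For orientation, a brief reference sketch (standard textbook definitions, not quoted from the paper) of the classical Mahalanobis distance and the Cameron–Martin-type norm that the variance norm generalizes:

```latex
% Classical Mahalanobis distance of x from mean m, covariance \Sigma invertible:
\[
  d_M(x)^2 = (x - m)^{\top} \Sigma^{-1} (x - m).
\]
% Cameron--Martin-type norm on a Hilbert space H with covariance operator C;
% the paper's variance norm extends this idea to non-Gaussian measures and
% non-injective C (via a pseudo-inverse on the range of C^{1/2}):
\[
  \lVert h \rVert_{\mu} = \lVert C^{-1/2} h \rVert_{H},
  \qquad h \in \operatorname{ran} C^{1/2}.
\]
```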

📝 Abstract
We present a unified theory for Mahalanobis-type anomaly detection on Banach spaces, using ideas from Cameron–Martin theory applied to non-Gaussian measures. This approach leads to a basis-free, data-driven notion of anomaly distance through the so-called variance norm of a probability measure, which can be consistently estimated using empirical measures. Our framework generalizes the classical $\mathbb{R}^d$, functional $(L^2[0,1])^d$, and kernelized settings, including the general case of non-injective covariance operator. We prove that the variance norm depends solely on the inner product in a given Hilbert space, and hence that the kernelized Mahalanobis distance can naturally be recovered by working on reproducing kernel Hilbert spaces. Using the variance norm, we introduce the notion of a kernelized nearest-neighbour Mahalanobis distance for semi-supervised anomaly detection. In an empirical study on 12 real-world datasets, we demonstrate that the kernelized nearest-neighbour Mahalanobis distance outperforms the traditional kernelized Mahalanobis distance for multivariate time series anomaly detection, using state-of-the-art time series kernels such as the signature, global alignment, and Volterra reservoir kernels. Moreover, we provide an initial theoretical justification of nearest-neighbour Mahalanobis distances by developing concentration inequalities in the finite-dimensional Gaussian case.
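To make the kernel-level computation concrete, here is a minimal sketch, not the authors' implementation: it computes a Mahalanobis-type score purely from kernel evaluations via the centred Gram matrix, handling the non-injective covariance case with a pseudo-inverse that truncates near-zero eigenvalues. An RBF kernel stands in for the paper's signature, global alignment, or Volterra reservoir kernels, and the nearest-neighbour variant is one natural reading (smallest Mahalanobis-type distance to any individual training point rather than to the mean); the paper's exact construction may differ.

```python
# Minimal sketch of a kernelized Mahalanobis anomaly score (illustrative,
# not the authors' code). An RBF kernel stands in for time-series kernels.
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * sq)

class KernelMahalanobis:
    """Mahalanobis-type anomaly scores computed entirely from kernel values."""

    def fit(self, X, gamma=1.0, tol=1e-10):
        self.X, self.gamma, self.n = X, gamma, len(X)
        self.K = rbf_kernel(X, X, gamma)
        H = np.eye(self.n) - np.ones((self.n, self.n)) / self.n
        self.Kc = H @ self.K @ H                 # centred Gram matrix
        lam, U = np.linalg.eigh(self.Kc)
        keep = lam > tol * lam.max()             # drop the (near-)null space:
        # the pseudo-inverse handles a non-injective empirical covariance
        self.Kc_pinv = (U[:, keep] / lam[keep]) @ U[:, keep].T
        return self

    def _centred_cross(self, Y):
        """Centred kernel vectors between test rows and the training set."""
        kY = rbf_kernel(Y, self.X, self.gamma)                       # (m, n)
        return (kY - kY.mean(axis=1, keepdims=True)
                   - self.K.mean(axis=0)[None, :] + self.K.mean())

    def score(self, Y):
        """Squared Mahalanobis-type distance of each test point to the mean."""
        kc = self._centred_cross(Y)
        return self.n * np.sum((kc @ self.Kc_pinv) ** 2, axis=1)

    def score_nn(self, Y):
        """Hypothetical nearest-neighbour variant: smallest Mahalanobis-type
        distance from a test point to any single training point."""
        kc = self._centred_cross(Y)                                  # (m, n)
        diffs = kc[:, None, :] - self.Kc[None, :, :]                 # (m, n, n)
        d2 = self.n * np.sum((diffs @ self.Kc_pinv) ** 2, axis=2)
        return d2.min(axis=1)
```

The factor of n in the scores comes from the empirical covariance being an average of n rank-one terms, so the Mahalanobis-type distance reduces to n times the squared norm of the pseudo-inverted centred kernel vector.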
Problem

Research questions and friction points this paper is trying to address.

Extend Mahalanobis distance to separable Banach spaces
Generalize covariance operators to non-injective cases
Improve novelty detection with kernelized nearest-neighbour method
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends Mahalanobis distance to Banach spaces
Introduces kernelized nearest-neighbour Mahalanobis distance
Uses variance norm for anomaly detection
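A toy run of the KernelMahalanobis sketch above; data and parameters are purely illustrative:

```python
# Toy usage of the KernelMahalanobis sketch above (hypothetical data).
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 8))                       # "normal" regime
test = np.vstack([rng.normal(size=(5, 8)),              # in-distribution
                  rng.normal(loc=4.0, size=(5, 8))])    # shifted anomalies

det = KernelMahalanobis().fit(train, gamma=0.1)
print(det.score(test))     # distance-to-mean scores; last 5 should be larger
print(det.score_nn(test))  # nearest-neighbour variant scores
```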