Directional anomaly detection

📅 2024-10-30
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
In semi-supervised anomaly detection, real-world applications often require directional sensitivity, for example flagging only anomalies with high attribute values. Conventional symmetric distance measures cannot express this, nor the domain constraint that "good" and "bad" attribute deviations must not cancel each other out. Method: The paper proposes two asymmetric distance measures: (i) ramp distance, which achieves directional sensitivity and robustness through a piecewise-linear penalty; and (ii) signed distance, which explicitly encodes the sign of each deviation. Ramp distance is the primary method; signed distance is effective on synthetic data but degrades substantially on real-world data. Contribution/Results: The work is presented as the first to formalize monotonicity constraints in anomaly detection. Evaluated on multiple real-world datasets, ramp distance matches or outperforms absolute distance, offering an interpretable and deployable approach to directional anomaly detection driven by business semantics.

📝 Abstract
Semi-supervised anomaly detection is based on the principle that potential anomalies are those records that look different from normal training data. However, in some cases we are specifically interested in anomalies that correspond to high attribute values (or low, but not both). We present two asymmetrical distance measures that take this directionality into account: ramp distance and signed distance. Through experiments on synthetic and real-life datasets we show that ramp distance performs as well as or better than the absolute distance traditionally used in anomaly detection. While signed distance also performs well on synthetic data, it performs substantially worse on real-life datasets. We argue that this reflects the fact that in practice, good scores on some attributes should not be allowed to compensate for bad scores on others.
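The difference between the three measures named in the abstract can be illustrated with a small sketch. The abstract does not give the formal definitions, so the following is an assumption: per attribute, absolute distance is taken as |x - y|, ramp distance as max(0, x - y), and signed distance as x - y, where x is the test record and y a training record. The per-attribute values are summed, and the anomaly score is the distance to the nearest training record. The function names (`absolute_diff`, `ramp_diff`, `signed_diff`, `nn_score`) and the nearest-neighbour scoring are hypothetical choices for illustration, not the authors' implementation.

```python
# Illustrative sketch only. The per-attribute definitions below are
# assumptions; the abstract does not state the paper's exact formulas.

def absolute_diff(test, train):
    # Symmetric: deviations in either direction are penalised.
    return [abs(t - r) for t, r in zip(test, train)]

def ramp_diff(test, train):
    # Asymmetric: only attributes where the test record is HIGHER
    # than the training record contribute; lower values count as 0.
    return [max(0.0, t - r) for t, r in zip(test, train)]

def signed_diff(test, train):
    # Asymmetric: keeps the sign, so a low value on one attribute
    # can offset a high value on another when the terms are summed.
    return [t - r for t, r in zip(test, train)]

def nn_score(test, training, diff):
    # Hypothetical anomaly score: smallest summed per-attribute
    # difference between the test record and any training record.
    return min(sum(diff(test, r)) for r in training)

if __name__ == "__main__":
    training = [(1.0, 1.0), (2.0, 2.0)]
    mixed = (3.0, 0.0)   # high on the first attribute, low on the second
    low = (0.0, 0.0)     # low on both attributes
    for name, diff in [("absolute", absolute_diff),
                       ("ramp", ramp_diff),
                       ("signed", signed_diff)]:
        print(name, nn_score(mixed, training, diff), nn_score(low, training, diff))
```

Under these assumptions, the record (3.0, 0.0) scores 3.0 with absolute distance, 1.0 with ramp distance, and -1.0 with signed distance. The signed score shows the compensation effect the abstract warns about: the low second attribute cancels the high first attribute. The record (0.0, 0.0), which is low on both attributes, scores 0.0 under ramp distance, so it is not flagged when only high values are of interest.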
Problem

Research questions and friction points this paper is trying to address.

Detecting anomalies that correspond specifically to high attribute values (or low, but not both)
Introducing asymmetrical distance measures that take this directionality into account
Comparing ramp distance and signed distance against absolute distance on synthetic and real-life datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

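Ramp distance: an asymmetrical measure that performs as well as or better than absolute distance
Signed distance: an asymmetrical measure that performs well on synthetic data but substantially worse on real-life datasets
Argues that good scores on some attributes should not compensate for bad scores on others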
Oliver Urs Lenz
Leiden Institute of Advanced Computer Science, Leiden University; Research Group for Computational Web Intelligence, Department of Applied Mathematics, Computer Science and Statistics, Ghent University
Matthijs van Leeuwen
Associate Professor, Leiden University
Data mining, pattern mining, information theory, machine learning, artificial intelligence