🤖 AI Summary
In semi-supervised anomaly detection, real-world applications often require directional sensitivity—e.g., detecting only high-value anomalies—yet conventional symmetric distance metrics cannot encode the domain constraint that "good" and "bad" attribute deviations should not cancel each other out. Method: We propose two asymmetric distance metrics: (i) ramp distance, which achieves directional sensitivity and robustness via a piecewise-linear penalty; and (ii) signed distance, which explicitly retains the sign of each deviation. Ramp distance serves as the primary method; signed distance is effective on synthetic data but degrades substantially under real-world noise. Contribution/Results: This work is the first to formalize such directionality constraints in anomaly detection. Across multiple real-world datasets, ramp distance matches or outperforms absolute distance, offering an interpretable, deployable approach to business-semantics–driven directional anomaly identification.
📝 Abstract
Semi-supervised anomaly detection is based on the principle that potential anomalies are those records that look different from normal training data. However, in some cases we are specifically interested in anomalies that correspond to high attribute values (or low values, but not both). We present two asymmetric distance measures that take this directionality into account: ramp distance and signed distance. Through experiments on synthetic and real-life datasets we show that ramp distance performs as well as or better than the absolute distance traditionally used in anomaly detection. While signed distance also performs well on synthetic data, it performs substantially worse on real-life datasets. We argue that this reflects the fact that in practice, good scores on some attributes should not be allowed to compensate for bad scores on others.
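The contrast between the three measures can be sketched numerically. The definitions below are a plausible per-attribute formalization for detecting *high-value* anomalies, not necessarily the paper's exact formulas; `ref` stands in for a normal reference point (e.g. a nearest normal training record) and is an assumption of this sketch.

```python
import numpy as np

def absolute_distance(x, ref):
    # symmetric baseline: deviations in either direction count equally
    return np.abs(x - ref)

def ramp_distance(x, ref):
    # asymmetric, piecewise linear: penalize only upward deviations;
    # downward ("good") deviations contribute zero, so they cannot
    # offset upward ("bad") deviations on other attributes
    return np.maximum(x - ref, 0.0)

def signed_distance(x, ref):
    # asymmetric but sign-preserving: negative (good) deviations can
    # cancel positive (bad) ones when summed across attributes
    return x - ref

ref = np.array([1.0, 1.0])
record = np.array([3.0, -1.0])  # one bad attribute, one good

print(absolute_distance(record, ref).sum())  # 4.0
print(ramp_distance(record, ref).sum())      # 2.0: only the bad attribute counts
print(signed_distance(record, ref).sum())    # 0.0: good cancels bad, anomaly hidden
```

The last line illustrates the abstract's closing argument: under signed distance a genuinely anomalous record can score as perfectly normal because a good attribute compensates for a bad one, whereas ramp distance blocks this cancellation.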