🤖 AI Summary
This work addresses structured anomaly detection—identifying samples that deviate from the intrinsic low-dimensional manifold governing data regularity. We propose a novel preference-embedding isolation framework: (1) data are first mapped into a high-dimensional preference space to enhance structural separability; (2) three adaptive isolation strategies are introduced—Voronoi-iForest (theoretically universal), RuzHash-iForest (LSH-accelerated), and Sliding-PIF (sliding-window-guided local prior). To our knowledge, this is the first approach unifying preference embedding with isolation mechanisms. Evaluated on diverse structured datasets, it achieves an average 12.7% AUC improvement over state-of-the-art methods—including iForest, LOF, and leading deep anomaly detectors—while accelerating inference by 3.2×. The framework thus advances both detection accuracy and computational efficiency for structured anomaly detection.
📝 Abstract
We address the problem of detecting anomalies as samples that do not conform to structured patterns represented by low-dimensional manifolds. To this end, we conceive a general anomaly detection framework called Preference Isolation Forest (PIF), that combines the benefits of adaptive isolation-based methods with the flexibility of preference embedding. The key intuition is to embed the data into a high-dimensional preference space by fitting low-dimensional manifolds, and to identify anomalies as isolated points. We propose three isolation approaches to identify anomalies: $i$) Voronoi-iForest, the most general solution, $ii$) RuzHash-iForest, that avoids explicit computation of distances via Local Sensitive Hashing, and $iii$) Sliding-PIF, that leverages a locality prior to improve efficiency and effectiveness.