AI Summary
This paper addresses the detection of anomalies with respect to structured patterns by proposing Preference-based Isolation Forest (PIF), a method that embeds the raw data in a high-dimensional preference space and applies a tree-based method, PI-Forest, to compute an anomaly score efficiently. Its core contribution is combining the advantages of adaptive isolation methods with the flexibility of preference embedding, which supports anomaly modeling under arbitrary distances and makes anomalies easier to isolate in the preference space. Experiments on synthetic and real-world datasets show that PIF compares favorably with state-of-the-art anomaly detection techniques, and confirm that PI-Forest is better at measuring arbitrary distances and isolating points in the preference space.
Abstract
We address the problem of detecting anomalies with respect to structured patterns. To this end, we conceive a novel anomaly detection method called PIF, which combines the advantages of adaptive isolation methods with the flexibility of preference embedding. Specifically, we propose to embed the data in a high-dimensional space where an efficient tree-based method, PI-Forest, is employed to compute an anomaly score. Experiments on synthetic and real datasets demonstrate that PIF compares favorably with state-of-the-art anomaly detection techniques, and confirm that PI-Forest is better at measuring arbitrary distances and isolating points in the preference space.
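The two stages named in the abstract (preference embedding, then tree-based isolation in the embedded space) can be illustrated with a minimal sketch. The abstract does not specify the model family, the preference function, the distance, or the tree construction, so everything below is an assumption made for illustration: line models fitted to random point pairs, an exponential preference with a `3 * sigma` cutoff, the Tanimoto distance, and Voronoi-style splits around randomly chosen seeds. The function names are hypothetical and this is not the authors' implementation.

```python
import math
import random


def preference_embedding(points, n_models, sigma, rng):
    """Map each 2-D point to its preferences for random line models (assumed model family)."""
    models = []
    for _ in range(n_models):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2            # normal of the line through both samples
        norm = math.hypot(a, b) or 1.0     # guard against coincident samples
        models.append((a / norm, b / norm, -(a * x1 + b * y1) / norm))
    embedded = []
    for x, y in points:
        row = []
        for a, b, c in models:
            r = abs(a * x + b * y + c)     # point-to-line residual
            # Assumed preference function: exponential decay, zero beyond 3 * sigma.
            row.append(math.exp(-r / sigma) if r <= 3 * sigma else 0.0)
        embedded.append(row)
    return embedded


def tanimoto_distance(p, q):
    """Tanimoto distance; two all-zero vectors are treated as distance 0 (a convention chosen here)."""
    dot = sum(u * v for u, v in zip(p, q))
    denom = sum(u * u for u in p) + sum(v * v for v in q) - dot
    return 1.0 - dot / denom if denom > 0 else 0.0


def mean_isolation_depth(vectors, n_trees, branching, max_depth, rng):
    """Average depth at which each vector ends up alone under Voronoi-style random splits."""
    total = [0] * len(vectors)

    def split(node, depth):
        if len(node) <= 1 or depth >= max_depth:
            for i in node:
                total[i] += depth
            return
        seeds = rng.sample(node, min(branching, len(node)))
        cells = [[] for _ in seeds]
        for i in node:
            dists = [tanimoto_distance(vectors[i], vectors[s]) for s in seeds]
            cells[dists.index(min(dists))].append(i)   # ties go to the first seed
        for cell in cells:
            split(cell, depth + 1)

    for _ in range(n_trees):
        split(list(range(len(vectors))), 0)
    return [t / n_trees for t in total]


rng = random.Random(0)
inliers = [(0.25 * i, rng.gauss(0.0, 0.02)) for i in range(40)]   # noisy points on a line
points = inliers + [(5.0, 8.0)]                                   # last point is a planted anomaly
vectors = preference_embedding(points, n_models=30, sigma=0.1, rng=rng)
depths = mean_isolation_depth(vectors, n_trees=100, branching=2, max_depth=12, rng=rng)
print(depths[-1], sorted(depths)[len(depths) // 2])               # anomaly depth vs. median depth
```

In this sketch a smaller mean depth means a point is isolated sooner and is therefore more anomalous. The planted point shares almost no preferences with the points on the line, so its Tanimoto distance to them is close to 1 and it tends to be separated after fewer splits.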