Online Isolation Forest

📅 2025-05-14

🤖 AI Summary

Existing anomaly detection methods face a dual limitation: offline approaches need repeated access to data in memory and so cannot process streams in real time, while online methods typically rely on periodic retraining or on stored historical data. This paper introduces Online-iForest, an online variant of Isolation Forest designed explicitly for streaming data, which updates incrementally in a single pass over the stream without periodic retraining. Its core components include a dynamic construction mechanism for random-split trees, a node-weight decay strategy, and a sliding time window for tracking concept drift. On real-world datasets, Online-iForest performs on par with online alternatives and closely rivals state-of-the-art offline methods that are periodically retrained, while outperforming all competitors in efficiency. This makes it a promising fit for low-latency applications such as cybersecurity, fraud detection, and fault detection.

📝 Abstract

The anomaly detection literature is abundant with offline methods, which require repeated access to data in memory, and impose impractical assumptions when applied to a streaming context. Existing online anomaly detection methods also generally fail to address these constraints, resorting to periodic retraining to adapt to the online context. We propose Online-iForest, a novel method explicitly designed for streaming conditions that seamlessly tracks the data generating process as it evolves over time. Experimental validation on real-world datasets demonstrated that Online-iForest is on par with online alternatives and closely rivals state-of-the-art offline anomaly detection techniques that undergo periodic retraining. Notably, Online-iForest consistently outperforms all competitors in terms of efficiency, making it a promising solution in applications where fast identification of anomalies is of primary importance such as cybersecurity, fraud and fault detection.
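The streaming constraint described above (each point is seen once, scored, and then absorbed into the model with no retraining step) can be illustrated with a score-then-update loop. The sketch below is not Online-iForest and does not reproduce the paper's method: it uses a running z-score detector, based on Welford's single-pass variance update, as a deliberately simple stand-in. The names `RunningZScore` and `score_stream` are hypothetical and chosen only for this illustration.

```python
import math
from typing import Iterable, Iterator


class RunningZScore:
    """Stand-in streaming detector (NOT Online-iForest): running mean and variance."""

    def __init__(self) -> None:
        self.n = 0        # number of points absorbed so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations (Welford's algorithm)

    def score(self, x: float) -> float:
        """Absolute z-score of x under the current model; 0.0 until variance is defined."""
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def learn(self, x: float) -> None:
        """Absorb x in O(1) time and memory; the point itself is not stored."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)


def score_stream(stream: Iterable[float], model: RunningZScore) -> Iterator[float]:
    """Score each point before learning from it, in a single pass over the stream."""
    for x in stream:
        yield model.score(x)
        model.learn(x)


if __name__ == "__main__":
    data = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 8.0, 1.0]
    scores = list(score_stream(data, RunningZScore()))
    # Index of the point with the highest anomaly score.
    print(max(range(len(scores)), key=scores.__getitem__))  # prints 6
```

Scoring before learning keeps a point from influencing its own score, and the model's memory footprint stays constant regardless of stream length. A tree-based streaming detector would replace the two running statistics with its own incremental structure while keeping the same loop.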

Problem

Research questions and friction points this paper is trying to address.

Addresses the impracticality of offline methods, which need repeated in-memory access to data, in streaming settings

Removes the need for periodic retraining in online detection

Improves efficiency where fast anomaly identification matters most

Innovation

Methods, ideas, or system contributions that make the work stand out.

Online-iForest, a method designed explicitly for streaming anomaly detection

Tracks the data-generating process as it evolves over time

Outperforms all compared methods in efficiency
