Online Isolation Forest

📅 2025-05-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing anomaly detection methods have two limitations: offline approaches cannot process data streams in real time, while online methods typically rely on periodic retraining or on storing historical data. This paper introduces Online-iForest, an online variant of Isolation Forest that updates incrementally in a single pass over the stream, without periodic retraining. Its core components are a dynamic construction mechanism for random-split trees, a node-weight decay strategy, and a sliding time window for handling concept drift. On real-world streaming datasets, Online-iForest performs on par with online alternatives and close to state-of-the-art offline methods that are periodically retrained, while being more efficient than all competitors. This makes it a candidate for low-latency applications such as network security and fraud detection.

📝 Abstract
The anomaly detection literature is abundant with offline methods, which require repeated access to data in memory, and impose impractical assumptions when applied to a streaming context. Existing online anomaly detection methods also generally fail to address these constraints, resorting to periodic retraining to adapt to the online context. We propose Online-iForest, a novel method explicitly designed for streaming conditions that seamlessly tracks the data generating process as it evolves over time. Experimental validation on real-world datasets demonstrated that Online-iForest is on par with online alternatives and closely rivals state-of-the-art offline anomaly detection techniques that undergo periodic retraining. Notably, Online-iForest consistently outperforms all competitors in terms of efficiency, making it a promising solution in applications where fast identification of anomalies is of primary importance such as cybersecurity, fraud and fault detection.
Problem

Research questions and friction points this paper is trying to address.

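Offline anomaly detectors need repeated access to data in memory, which is impractical for streams
Existing online detectors fall back on periodic retraining to adapt to the stream
Applications such as cybersecurity, fraud and fault detection need anomalies identified quickly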
Innovation

Methods, ideas, or system contributions that make the work stand out.

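Online-iForest, an anomaly detection method designed explicitly for streaming data
Tracks the data generating process as it evolves over time
More efficient than all compared methods, with detection performance on par with online alternatives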