Randomized PCA Forest for Outlier Detection

📅 2025-08-18
🤖 AI Summary
This paper addresses the challenges of modeling high-dimensional data and high computational cost in unsupervised anomaly detection. We propose a novel detection method based on a Randomized Principal Component Analysis (RPCA) Forest, which constructs an ensemble of RPCA trees to exploit randomized subspace partitioning and low-rank approximation for efficient anomaly scoring. To further enhance scalability, we integrate approximate nearest neighbor search for rapid anomaly measurement, balancing accuracy and efficiency. Evaluated on 12 standard benchmark datasets, our method achieves an average AUC improvement of 3.2% over traditional PCA, Isolation Forest, and state-of-the-art deep learning methods, with 1.8–5.4× faster training. It also demonstrates superior robustness in low-sample and high-dimensional regimes. Our core contribution is the first integration of RPCA into a forest-based ensemble framework, unifying subspace diversity, computational scalability, and discriminative power in unsupervised anomaly detection.

📝 Abstract
We propose a novel unsupervised outlier detection method based on Randomized Principal Component Analysis (PCA). Inspired by the performance of the Randomized PCA (RPCA) Forest in approximate K-Nearest Neighbor (KNN) search, the method uses an RPCA Forest to detect outliers. Experimental results show that the proposed approach outperforms classical and state-of-the-art methods on several datasets and performs competitively on the rest. An extensive analysis of the method reflects its high generalization power and computational efficiency, making it a good choice for unsupervised outlier detection.
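
The listing does not spell out the algorithm, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes each tree splits its data at the median of a projection onto an approximate leading principal direction (computed with a small randomized range finder), that points sharing a leaf in any tree become candidate neighbors, and that the outlier score is the mean distance to the k closest candidates. The function names (`top_direction`, `build_leaves`, `rpca_forest_scores`) and all parameter defaults are hypothetical.

```python
import numpy as np

def top_direction(X, rng, n_iter=2):
    """Approximate the leading principal direction of X via a randomized sketch."""
    Xc = X - X.mean(axis=0)
    # Random probes of the feature space, refined by a few power iterations.
    Q = rng.standard_normal((X.shape[1], 4))
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(Xc.T @ (Xc @ Q))
    # Solve the small projected problem exactly and map back to feature space.
    _, _, Vt = np.linalg.svd(Xc @ Q, full_matrices=False)
    return Q @ Vt[0]

def build_leaves(X, idx, leaf_size, rng):
    """Recursively split the index set at the median projection; return the leaves."""
    if len(idx) <= leaf_size:
        return [idx]
    proj = X[idx] @ top_direction(X[idx], rng)
    left = proj <= np.median(proj)
    if left.all() or not left.any():  # degenerate split, stop here
        return [idx]
    return (build_leaves(X, idx[left], leaf_size, rng)
            + build_leaves(X, idx[~left], leaf_size, rng))

def rpca_forest_scores(X, n_trees=10, leaf_size=32, k=5, seed=0):
    """Assumed scoring rule: mean distance to the k nearest leaf co-members."""
    rng = np.random.default_rng(seed)
    n = len(X)
    candidates = [set() for _ in range(n)]
    for _ in range(n_trees):
        for leaf in build_leaves(X, np.arange(n), leaf_size, rng):
            members = leaf.tolist()
            for i in members:
                candidates[i].update(members)
    scores = np.full(n, np.inf)  # stays inf only if a point has no candidates
    for i, cand in enumerate(candidates):
        cand.discard(i)
        if cand:
            d = np.linalg.norm(X[list(cand)] - X[i], axis=1)
            scores[i] = np.sort(d)[:k].mean()
    return scores

# Usage: 200 Gaussian points plus one far-away point appended at index 200.
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((200, 5)), np.full((1, 5), 10.0)])
scores = rpca_forest_scores(X)
print("highest-scoring index:", int(np.argmax(scores)))
```

Restricting distance computations to leaf co-members is what keeps the neighbor search approximate and cheap: each point is compared against at most `n_trees * leaf_size` candidates rather than the whole dataset.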

Problem

Research questions and friction points this paper is trying to address.

Develops unsupervised outlier detection using Randomized PCA Forest
Improves performance over classical and state-of-the-art outlier detection methods
Ensures high generalization power and computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Randomized PCA Forest for outlier detection
Unsupervised method using RPCA Forest
Computationally efficient with high generalization