Hierarchical Reference Sets for Robust Unsupervised Detection of Scattered and Clustered Outliers

📅 2026-03-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets the detection of both scattered and clustered anomalies in IoT data. Clustered anomalies have high local density, so they are often misclassified as normal instances, which degrades detection performance. The authors propose an unsupervised graph-based anomaly detection method that constructs natural neighbor relationships among data points and introduces, reportedly for the first time, a hierarchical reference set mechanism. The mechanism assesses each point at both local and global scales, which separates the two anomaly types and keeps them from interfering with each other. According to the reported experiments, the approach outperforms existing methods, improves downstream clustering performance as well as detection accuracy, and is robust to hyperparameter settings.

📝 Abstract

Most real-world IoT data analysis tasks, such as clustering and anomaly event detection, are unsupervised and highly susceptible to the presence of outliers. In addition to sporadic scattered outliers caused by factors such as faulty sensor readings, IoT systems often exhibit clustered outliers. These occur when multiple devices or nodes produce similar anomalous measurements, for instance, owing to localized interference, emerging security threats, or regional false alarms, forming micro-clusters. These clustered outliers can be easily mistaken for normal behavior because of their relatively high local density, thereby obscuring the detection of both scattered and contextual anomalies. To address this, we propose a novel outlier detection paradigm that leverages the natural neighboring relationships using graph structures. This facilitates multi-perspective anomaly evaluation by incorporating reference sets at both local and global scales derived from the graph. Our approach enables the effective recognition of scattered outliers without interference from clustered anomalies, whereas the graph structure simultaneously helps reflect and isolate clustered outlier groups. Extensive experiments, including comparative performance analysis, ablation studies, validation on downstream clustering tasks, and evaluation of hyperparameter sensitivity, demonstrate the efficacy of the proposed method. The source code is available at https://github.com/gordonlok/DROD.
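The idea of reading scattered and clustered outliers off a neighbor graph can be illustrated with a small sketch. This is not the DROD implementation from the linked repository: it assumes a fixed-k mutual nearest-neighbor graph as a simplified stand-in for natural neighbor search, and it treats a point's own mutual neighbors as the "local" view and the size of its connected component relative to the whole dataset as the "global" view. The function names, `k`, and `min_cluster_frac` are all hypothetical choices made for illustration.

```python
import math
from collections import deque


def knn(points, k):
    """For each point, return the set of indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def mutual_knn_graph(points, k):
    """Keep an edge i-j only when i and j appear in each other's k-NN sets."""
    nn = knn(points, k)
    return [{j for j in nn[i] if i in nn[j]} for i in range(len(points))]


def components(graph):
    """Connected components of an undirected graph, found by breadth-first search."""
    seen, comps = set(), []
    for start in range(len(graph)):
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        comps.append(sorted(comp))
    return comps


def label_points(points, k=4, min_cluster_frac=0.2):
    """Label each point 'normal', 'clustered', or 'scattered' (illustrative rule).

    Assumption: a component with one point is a scattered outlier, and a
    component smaller than min_cluster_frac of the data is a clustered outlier group.
    """
    labels = [None] * len(points)
    for comp in components(mutual_knn_graph(points, k)):
        if len(comp) == 1:
            kind = "scattered"
        elif len(comp) < min_cluster_frac * len(points):
            kind = "clustered"
        else:
            kind = "normal"
        for i in comp:
            labels[i] = kind
    return labels


if __name__ == "__main__":
    # Toy data: a 5x5 grid of normal readings, a 3-point micro-cluster, one stray point.
    data = [(float(x), float(y)) for x in range(5) for y in range(5)]
    data += [(20.0, 20.0), (20.5, 20.0), (20.0, 20.5), (-15.0, 30.0)]
    for point, label in zip(data[-4:], label_points(data)[-4:]):
        print(point, label)
```

In this toy setup the micro-cluster is dense locally, so a purely local density score could rate it as normal; it only stands out when its component is compared against the dataset as a whole. The stray point, by contrast, has no mutual neighbors at all.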

Problem

Research questions and friction points this paper is trying to address.

outlier detection

clustered outliers

scattered outliers

unsupervised learning

IoT data

Innovation

Methods, ideas, or system contributions that make the work stand out.

hierarchical reference sets

graph-based outlier detection

clustered outliers