Label-Informed Outlier Detection Based on Granule Density

📅 2025-12-21
🤖 AI Summary
Existing semi-supervised outlier detection methods typically treat data as purely numerical and deterministic, overlooking the heterogeneity and uncertainty of real-world data. To address this, we propose the Granule Density-based Outlier Factor (GDOF), the first approach to embed sparse anomaly labels into a fuzzy granulation process. GDOF constructs an attribute-adaptive granular density ensemble: it models multi-granularity uncertainty via fuzzy sets, captures heterogeneous attribute structures using granular computing principles, and enhances discriminability through label-guided density estimation and attribute-correlation-weighted fusion. Extensive experiments on multiple real-world heterogeneous datasets demonstrate that GDOF achieves state-of-the-art performance with only a minimal number of labeled anomalies (e.g., 5–10 samples), significantly outperforming existing semi-supervised methods.

📝 Abstract
Outlier detection, crucial for identifying unusual patterns with significant implications across numerous applications, has drawn considerable research interest. Existing semi-supervised methods typically treat data as purely numerical and in a deterministic manner, thereby neglecting the heterogeneity and uncertainty inherent in complex, real-world datasets. This paper introduces a label-informed outlier detection method for heterogeneous data based on Granular Computing and Fuzzy Sets, namely Granule Density-based Outlier Factor (GDOF). Specifically, GDOF first employs label-informed fuzzy granulation to effectively represent various data types and develops granule density for precise density estimation. Subsequently, granule densities from individual attributes are integrated for outlier scoring by assessing attribute relevance with a limited number of labeled outliers. Experimental results on various real-world datasets show that GDOF stands out in detecting outliers in heterogeneous data with a minimal number of labeled outliers. The integration of Fuzzy Sets and Granular Computing in GDOF offers a practical framework for outlier detection in complex and diverse data types. All relevant datasets and source codes are publicly available for further research. This is the author's accepted manuscript of a paper published in IEEE Transactions on Fuzzy Systems. The final version is available at https://doi.org/10.1109/TFUZZ.2024.3514853
Problem

Research questions and friction points this paper is trying to address.

Detects outliers in heterogeneous data using limited labeled examples
Addresses data uncertainty and diversity via fuzzy granulation and density estimation
Integrates granular computing and fuzzy sets for complex data outlier detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses label-informed fuzzy granulation for heterogeneous data
Integrates granule densities from attributes for outlier scoring
Combines Fuzzy Sets and Granular Computing for detection framework
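
As a rough illustration of the three ideas above, the following Python sketch builds a per-attribute fuzzy similarity, averages it into a density, and fuses the per-attribute densities with weights derived from a few labeled outliers. It is not the GDOF definition from the paper: the similarity function, the density formula, the weighting rule, and all function names are assumptions chosen for brevity.

```python
from typing import List, Sequence, Union

Value = Union[float, str]  # numeric or nominal attribute value


def fuzzy_similarity(a: Value, b: Value, value_range: float) -> float:
    """Similarity in [0, 1] between two values of one attribute (assumed form)."""
    if isinstance(a, str) or isinstance(b, str):
        return 1.0 if a == b else 0.0  # nominal attribute: crisp match
    if value_range == 0:
        return 1.0  # constant numeric attribute: all values coincide
    return max(0.0, 1.0 - abs(a - b) / value_range)


def granule_densities(column: Sequence[Value]) -> List[float]:
    """Mean similarity of each sample to all samples on one attribute.

    Each sample's row of similarities plays the role of its fuzzy granule;
    the mean membership is used here as a stand-in for granule density.
    """
    numeric = [v for v in column if not isinstance(v, str)]
    value_range = (max(numeric) - min(numeric)) if numeric else 0.0
    n = len(column)
    return [
        sum(fuzzy_similarity(x, y, value_range) for y in column) / n
        for x in column
    ]


def attribute_weights(
    densities: List[List[float]], labeled_outliers: Sequence[int]
) -> List[float]:
    """Hypothetical relevance rule: an attribute matters more when the
    labeled outliers are less dense on it than the average sample."""
    m = len(densities)
    if not labeled_outliers:
        return [1.0 / m] * m  # no labels: fall back to uniform weights
    raw = []
    for dens in densities:
        mean_all = sum(dens) / len(dens)
        mean_out = sum(dens[i] for i in labeled_outliers) / len(labeled_outliers)
        raw.append(max(mean_all - mean_out, 0.0))
    total = sum(raw)
    if total == 0:
        return [1.0 / m] * m
    return [r / total for r in raw]


def outlier_scores(
    rows: Sequence[Sequence[Value]], labeled_outliers: Sequence[int]
) -> List[float]:
    """Weighted sum over attributes of (1 - density); higher is more anomalous."""
    columns = list(zip(*rows))
    dens = [granule_densities(col) for col in columns]
    weights = attribute_weights(dens, labeled_outliers)
    return [
        sum(w * (1.0 - d[i]) for w, d in zip(weights, dens))
        for i in range(len(rows))
    ]


# Toy mixed-type data: one numeric and one nominal attribute.
rows = [(1.0, "a"), (1.1, "a"), (0.9, "a"), (1.2, "a"), (5.0, "b")]
scores = outlier_scores(rows, labeled_outliers=[4])
```

In this toy data the last row is far from the others on both attributes, so it receives the largest score. Handling numeric and nominal columns through the same similarity interface is what lets a single density definition cover heterogeneous data in this sketch.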
Baiyang Chen
College of Computer Science, Sichuan University, Chengdu 610065, China
Zhong Yuan
Penn State University
Deep Learning in Health Care, Diffusion Model
Dezhong Peng
Sichuan University
Multi-modal Learning, Multimedia Analysis, Neural Network
Hongmei Chen
School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
Xiaomin Song
Sichuan National Innovation New Vision UHD Video Technology Co., Ltd., Chengdu 610095, China
Huiming Zheng
Peking University
Image Compression, Video Compression, Point Cloud Compression