🤖 AI Summary
Unsupervised graph anomaly detection fails when both node attributes and structural information are simultaneously missing. Method: We propose the first unified framework robust to multi-value missingness. It employs a dual-path encoder to independently reconstruct attributes and structure, thereby preventing cross-view error propagation. In the joint latent space, we introduce latent-space regularization and joint reconstruction learning, and innovatively design a hard negative sample generation strategy based on latent-space sampling to mitigate anomaly masking caused by imputation and sharpen the decision boundary between normal and anomalous distributions. Results: Extensive experiments on seven benchmark datasets demonstrate that our method significantly outperforms existing unsupervised graph anomaly detection approaches and maintains strong robustness across varying missingness rates.
📝 Abstract
Unsupervised graph anomaly detection (GAD) has received increasing attention in recent years; it aims to identify anomalous patterns using only unlabeled node information from graph-structured data. However, prevailing unsupervised GAD methods typically presuppose complete node attribute and structural information, a condition rarely satisfied in real-world scenarios owing to privacy constraints, collection errors, or dynamic node arrivals. Existing standard imputation schemes risk "repairing" rare anomalous nodes so that they appear normal, thereby introducing imputation bias into the detection process. In addition, when both node attributes and edges are missing simultaneously, estimation errors in one view can contaminate the other, causing cross-view interference that further undermines detection performance. To overcome these challenges, we propose M$^2$V-UGAD, a multiple missing values-resistant unsupervised GAD framework for incomplete graphs. Specifically, a dual-pathway encoder is first proposed to independently reconstruct missing node attributes and graph structure, thereby preventing errors in one view from propagating to the other. The two pathways are then fused and regularized in a joint latent space so that normal nodes occupy a compact inner manifold while anomalies reside on an outer shell. Lastly, to mitigate imputation bias, we sample latent codes just outside the normal region and decode them into realistic node features and subgraphs, providing hard negative examples that sharpen the decision boundary. Experiments on seven public benchmarks demonstrate that M$^2$V-UGAD consistently outperforms existing unsupervised GAD methods across varying missing rates.
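The hard-negative generation step described above can be sketched minimally: if normal nodes form a compact inner region in the joint latent space, one can model that region as a hypersphere and draw samples just beyond its boundary before decoding them. This is an illustrative assumption only; the function name `sample_hard_negatives`, the hypersphere model, and the `margin` parameter are not from the paper, and the decoder is omitted.

```python
import numpy as np

def sample_hard_negatives(latents, n_samples=16, margin=1.1, rng=None):
    """Draw latent codes just outside the normal region.

    Hypothetical sketch: the normal region is approximated as a
    hypersphere (mean of the latent codes plus a covering radius),
    and samples are placed slightly beyond that radius. The actual
    M^2V-UGAD sampling strategy may differ.
    """
    rng = np.random.default_rng(rng)
    center = latents.mean(axis=0)
    # Covering radius of the observed (assumed normal) latent codes
    radius = np.linalg.norm(latents - center, axis=1).max()
    # Random unit directions on the latent hypersphere
    dirs = rng.normal(size=(n_samples, latents.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # Scale factors slightly larger than 1 put samples outside the shell
    radii = radius * rng.uniform(margin, margin + 0.2, size=(n_samples, 1))
    return center + radii * dirs
```

In a full pipeline, these sampled codes would be passed through the decoder to obtain synthetic node features and subgraphs that serve as hard negatives during training.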