🤖 AI Summary
Graph anomaly detection (GAD) suffers significant performance degradation under train-test distribution shift, primarily due to unseen normal samples causing semantic confusion (misclassifying novel normals as anomalies) and aggregation contamination (distorting representations of known normal nodes). To address this, we propose TUNE, a lightweight test-time adaptation framework that enables online correction without retraining. TUNE introduces a graph aligner to minimize attribute representation shift and jointly optimizes semantic discrimination and neighborhood aggregation based on estimated aggregation contamination severity. Designed as a plug-and-play module, TUNE is compatible with diverse pre-trained GAD models. Extensive experiments across 10 real-world datasets demonstrate that TUNE substantially improves robustness against both synthetic and real-world unseen normal patterns, while effectively reducing false positive rates.
📝 Abstract
Graph anomaly detection (GAD), which aims to detect outliers in graph-structured data, has received increasing research attention recently. However, existing GAD methods assume identical training and testing distributions, which is rarely valid in practice. In real-world scenarios, unseen but normal samples may emerge during deployment, leading to a normality shift that degrades the performance of GAD models trained on the original data. Through empirical analysis, we reveal that the degradation arises from (1) semantic confusion, where unseen normal samples are misinterpreted as anomalies due to their novel patterns, and (2) aggregation contamination, where the representations of seen normal nodes are distorted by unseen normals through message aggregation. While retraining or fine-tuning GAD models could be a potential solution to the above challenges, the high cost of model retraining and the difficulty of obtaining labeled data often render this approach impractical in real-world applications. To bridge the gap, we propose a lightweight and plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns (TUNE) in GAD. To address semantic confusion, a graph aligner is employed to align the shifted data with the original distribution at the graph attribute level. The aligner is trained by minimizing representation-level shift as a supervision signal, leveraging the estimated aggregation contamination as a key indicator of normality shift. Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real-world unseen normal patterns.
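To make the attribute-level alignment idea concrete, here is a minimal, self-contained sketch of a test-time aligner. It is *not* the TUNE implementation (which learns the aligner by minimizing representation-level shift guided by estimated aggregation contamination); as a simplifying assumption, it uses a closed-form per-feature affine transform that matches the mean and standard deviation of shifted test-time node attributes to statistics stored from training. All function and variable names below are illustrative, not from the paper.

```python
import math

def fit_aligner(train_feats, test_feats):
    """Fit a per-feature affine aligner (a simplified stand-in for a
    learned graph aligner): rescale and shift test-time node attributes
    so their per-feature mean/std match the training distribution.

    train_feats / test_feats: lists of equal-length feature rows,
    e.g. one row of attributes per graph node.
    Returns a function mapping one test-time row to its aligned row.
    """
    def stats(rows):
        d, n = len(rows[0]), len(rows)
        mean = [sum(r[j] for r in rows) / n for j in range(d)]
        # Guard against zero variance with a fallback scale of 1.0.
        std = [math.sqrt(sum((r[j] - mean[j]) ** 2 for r in rows) / n) or 1.0
               for j in range(d)]
        return mean, std

    m_tr, s_tr = stats(train_feats)
    m_te, s_te = stats(test_feats)
    # Solve a * x + b so that aligned test stats equal training stats.
    scale = [s_tr[j] / s_te[j] for j in range(len(m_tr))]
    shift = [m_tr[j] - scale[j] * m_te[j] for j in range(len(m_tr))]
    return lambda row: [scale[j] * row[j] + shift[j] for j in range(len(row))]

# Usage: train attributes centered at 1.0; test attributes drifted to 12.0.
align = fit_aligner([[0.0], [2.0]], [[10.0], [14.0]])
aligned = [align(r) for r in [[10.0], [14.0]]]  # back near the training range
```

Because no pre-trained detector weights are touched, this kind of correction stays plug-and-play: the aligned attributes are simply fed to the frozen GAD model at inference time.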