🤖 AI Summary
To address severe label scarcity (labeling rate < 1%) in credit card fraud detection, this paper proposes a semi-supervised temporal graph learning framework. It constructs a temporal transaction graph to capture dependencies among transactions, uses a Gated Temporal Attention Network (GTAN) to learn node representations, and introduces an attribute-driven risk propagation mechanism that explicitly models how fraud patterns spread over the graph. The core contribution is the integration of attribute-aware temporal graph neural networks with risk propagation, which enables discriminative representation learning when very few labels are available. Experiments on one real-world transaction dataset and two publicly available fraud detection datasets show that the method outperforms state-of-the-art baselines: it achieves performance comparable to fully supervised models using only 0.5% labeled data, improving detection accuracy and generalization at very low labeling rates.
📝 Abstract
Credit card fraud incurs a considerable cost for both cardholders and issuing banks. Contemporary methods apply machine learning-based classifiers to detect fraudulent behavior from labeled transaction records. However, labeled data usually make up only a small proportion of the billions of real transactions because labeling is expensive, which means these methods fail to exploit many natural features of the unlabeled data. Therefore, we propose a semi-supervised graph neural network for fraud detection. Specifically, we use transaction records to construct a temporal transaction graph, composed of temporal transactions (nodes) and the interactions (edges) among them. We then pass messages among the nodes through a Gated Temporal Attention Network (GTAN) to learn transaction representations, and we further model fraud patterns through risk propagation among transactions. Extensive experiments are conducted on a real-world transaction dataset and two publicly available fraud detection datasets. The results show that our proposed method, GTAN, outperforms other state-of-the-art baselines on all three fraud detection datasets. Semi-supervised experiments demonstrate that our model achieves excellent fraud detection performance with only a tiny proportion of labeled data.
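To make the graph construction step concrete, here is a minimal sketch of how a temporal transaction graph might be built. The abstract does not say how edges are defined, so the rule used here is an assumption for illustration only: each transaction is linked to the previous `k` transactions on the same card. The function name `build_temporal_edges`, the tuple layout, and the parameter `k` are all hypothetical and not taken from the paper.

```python
from collections import defaultdict, deque


def build_temporal_edges(transactions, k=2):
    """Return directed edges (earlier_txn, later_txn).

    transactions: iterable of (txn_id, card_id, timestamp) tuples.
    Assumption (not from the paper): a transaction is connected to the
    k most recent earlier transactions made with the same card.
    """
    # Per-card sliding window holding the k most recent transaction ids.
    recent = defaultdict(lambda: deque(maxlen=k))
    edges = []
    # Process transactions in chronological order.
    for txn_id, card_id, _ts in sorted(transactions, key=lambda t: t[2]):
        for prev_id in recent[card_id]:
            edges.append((prev_id, txn_id))
        recent[card_id].append(txn_id)
    return edges


# Toy data: four transactions on cardA, one on cardB.
transactions = [
    ("t1", "cardA", 1),
    ("t2", "cardB", 2),
    ("t3", "cardA", 3),
    ("t4", "cardA", 5),
    ("t5", "cardA", 8),
]
edges = build_temporal_edges(transactions, k=2)
print(edges)
```

In this sketch, both labeled and unlabeled transactions become nodes, so a message-passing model such as GTAN could in principle draw on unlabeled neighbors when learning representations. The edge rule and window size would need to be replaced with whatever the paper actually specifies.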