🤖 AI Summary
Financial fraud detection struggles to identify previously unseen (novel) fraud types and lacks interpretable, traceable evidence chains for expert validation.
Method: We propose an end-to-end interpretable fraud discovery framework based on micro-clusters. It integrates multi-source feature engineering with fine-grained clustering to detect latent fraudulent micro-clusters in million-scale real-world transaction data. Crucially, detection and interpretability are deeply coupled: visual heatmaps and an interactive dashboard jointly construct auditable evidence chains to support expert analysis.
Contribution/Results: Our method uncovers three novel anomalous behavioral patterns; two were confirmed by domain experts as either fraudulent or suspicious, and hundreds of previously undetected fraudulent transactions were identified. The framework discovers unknown fraud types automatically while supporting attribution and traceability of the evidence. It is currently being considered for deployment at an Anonymous Financial Institution (AFI).
📝 Abstract
Given a set of financial transactions (who buys from whom, when, and for how much), as well as prior information from buyers and sellers, how can we find fraudulent transactions? If we have labels for some transactions for known types of fraud, we can build a classifier. However, we also want to find new types of fraud, still unknown to the domain experts ('Detection'). Moreover, we want to provide evidence to experts that supports our opinion ('Justification'). In this paper, we propose FRAUDGUESS, which achieves two goals: (a) for 'Detection', it spots new types of fraud as micro-clusters in a carefully designed feature space; (b) for 'Justification', it uses visualization and heatmaps for evidence, as well as an interactive dashboard for deep dives. FRAUDGUESS is used in real life and is currently being considered for deployment in an Anonymous Financial Institution (AFI). Thus, we also present the three new behaviors that FRAUDGUESS discovered in a real, million-scale financial dataset. Two of these behaviors are deemed fraudulent or suspicious by domain experts, which lets us catch hundreds of fraudulent transactions that would otherwise go unnoticed.
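The 'Detection' idea of spotting micro-clusters in a feature space can be illustrated with a minimal sketch. This is not the FRAUDGUESS algorithm, which the abstract does not specify; it is a simple stand-in that links points lying within a radius `eps` of each other and keeps only small linked groups. The function name `find_micro_clusters`, the parameters `eps`, `min_size`, and `max_size`, and the two-dimensional toy features are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def find_micro_clusters(points, eps=0.5, min_size=3, max_size=10):
    """Return index groups that are small and tight (hypothetical helper).

    Points closer than eps are linked; linked points form a group.
    Groups with min_size..max_size members are reported as micro-clusters.
    Larger groups are treated as normal behavior, smaller ones as lone outliers.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Walk to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of points within eps (O(n^2), fine for a toy example).
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(g for g in groups.values() if min_size <= len(g) <= max_size)


# Toy data in an assumed 2-D feature space (e.g. two engineered features).
main_mass = [(0.4 * i, 0.4 * j) for i in range(5) for j in range(5)]  # 25 "normal" points
micro = [(10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1)]      # small tight group
loner = [(20.0, 0.0)]                                                 # isolated outlier
points = main_mass + micro + loner

clusters = find_micro_clusters(points)
print(clusters)  # prints [[25, 26, 27, 28]]
```

In this toy run the large group of 25 points and the single isolated point are both ignored, and only the four tightly packed points are reported. On million-scale data the pairwise loop would be far too slow, so an indexed density-based clusterer would be the more realistic choice; the sketch only shows why a small, dense group is treated differently from both the bulk of the data and a lone outlier.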