🤖 AI Summary
In traditional financial risk management, pre-service risk assessment and in-service default detection are modeled separately, which limits the accuracy of ex-ante predictions. To address this, we propose a multi-granularity knowledge distillation framework that unifies cross-stage risk modeling by integrating dynamic in-service behavioral data through coarse-grained (sequence-level), fine-grained (behavior-level), and self-distillation mechanisms. We further introduce soft-label guidance and class-aware re-weighting to mitigate label imbalance and strengthen the transfer of default patterns. A teacher-student architecture is designed and evaluated on Tencent's large-scale mobile payment dataset. Experimental results demonstrate significant improvements: +3.2% in offline AUC, +4.8% in KS score, and a 12.7% reduction in the online bad-debt rate, outperforming all baseline methods. This work advances holistic, adaptive risk modeling by enabling effective knowledge transfer across service stages.
📝 Abstract
Typical financial risk management involves distinct phases for pre-service risk assessment and in-service default detection, which are often modeled separately. This paper proposes a novel framework, Multi-Granularity Knowledge Distillation (MGKD), aimed at improving pre-service risk prediction by integrating in-service user behavior data. MGKD follows the idea of knowledge distillation: a teacher model, trained on historical in-service data, guides a student model trained on pre-service data. Using soft labels derived from in-service data, the teacher helps the student improve its risk prediction before service activation. Meanwhile, a multi-granularity distillation strategy, comprising coarse-grained, fine-grained, and self-distillation, aligns the representations and predictions of the teacher and student models. This approach not only reinforces the representation of default cases but also transfers key behavioral patterns associated with defaulters from the teacher to the student, thereby improving the overall performance of pre-service risk assessment. Moreover, we adopt a re-weighting strategy to mitigate the model's bias toward the majority class. Experimental results on large-scale real-world datasets from Tencent Mobile Payment demonstrate the effectiveness of our proposed approach in both offline and online scenarios.
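The two training ingredients the abstract names, soft-label guidance from a teacher model and class-aware re-weighting of the hard-label loss, can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's actual MGKD objective: the function name `mgkd_style_loss`, the temperature `T`, the mixing weight `alpha`, and the per-class weights are all hypothetical placeholders.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mgkd_style_loss(student_logits, teacher_logits, labels,
                    T=2.0, alpha=0.5, class_weights=None):
    """Hypothetical sketch of a distillation objective:
    soft-label KD term + class-reweighted hard-label cross-entropy."""
    # Soft-label term: cross-entropy between softened teacher and
    # student distributions, scaled by T^2 as in standard distillation.
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    kd = -(p_teacher * log_p_student).sum(axis=-1) * (T ** 2)

    # Hard-label term: cross-entropy on the true labels, optionally
    # up-weighting the minority (default) class to counter imbalance.
    p_hard = softmax(student_logits)
    ce = -np.log(p_hard[np.arange(len(labels)), labels] + 1e-12)
    if class_weights is not None:
        ce = ce * np.asarray(class_weights)[labels]

    return float(np.mean(alpha * kd + (1 - alpha) * ce))
```

For example, with `class_weights=[1.0, 5.0]` the loss on defaulter samples (label 1) is amplified fivefold, pushing the student to fit the minority class that the re-weighting strategy targets.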