🤖 AI Summary
Problem: Frequent defaults in China's corporate bond market challenge traditional models, which struggle to capture irregular temporal sampling patterns while remaining financially interpretable. Method: We propose EMDLOT, an explainable multimodal deep learning framework that jointly models financial time series (using a time-aware LSTM to handle non-uniform sampling) and prospectus text (via soft-clustering preprocessing and hierarchical attention) to predict multiple risk states (e.g., default, extension). Contribution/Results: Evaluated on 1,994 Chinese firms, EMDLOT significantly outperforms benchmarks (XGBoost, standard LSTM) in recall, F1-score, and mean average precision (mAP), especially for early-stage default and extension prediction. In addition, attention weights and cluster-based feature attribution surface economically intuitive drivers of risk, so the framework combines high predictive accuracy with the transparency and trustworthiness required for financial decision-making.
📝 Abstract
In recent years, China's bond market has seen a surge in defaults amid regulatory reforms and macroeconomic volatility. Traditional machine learning models struggle to capture the irregularity and temporal dependencies of financial data, while most deep learning models lack interpretability, which is critical for financial decision-making. To tackle these issues, we propose EMDLOT (Explainable Multimodal Deep Learning for Time-series), a novel framework for multi-class bond default prediction. EMDLOT integrates numerical time series (financial and macroeconomic indicators) with unstructured textual data (bond prospectuses), uses a Time-Aware LSTM to handle irregularly sampled sequences, and adopts soft clustering and multi-level attention to improve interpretability. Experiments on 1,994 Chinese firms (2015-2024) show that EMDLOT outperforms traditional (e.g., XGBoost) and deep learning (e.g., LSTM) benchmarks in recall, F1-score, and mAP, especially in identifying firms whose bonds defaulted or were extended. Ablation studies validate the contribution of each component, and attention analyses reveal economically intuitive default drivers. This work provides a practical tool and a trustworthy framework for transparent financial risk modeling.
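To make the idea of handling irregular sequences concrete, the sketch below shows one common way a time-aware LSTM discounts its memory by the gap between observations. The abstract does not say which formulation EMDLOT uses, so this is only an assumption: the short-term part of the previous cell state is scaled by g(Δt) = 1 / log(e + Δt), while the long-term part is kept. The function names, and the per-unit weights `w_d` and `b_d` (a diagonal stand-in for a learned dense layer), are hypothetical and chosen for illustration.

```python
import math


def elapsed_time_decay(delta_t):
    """Decay weight g(dt) = 1 / log(e + dt); equals 1 at dt = 0 and shrinks as dt grows."""
    if delta_t < 0:
        raise ValueError("delta_t must be non-negative")
    return 1.0 / math.log(math.e + delta_t)


def adjust_memory(prev_cell, delta_t, w_d, b_d):
    """Discount the short-term part of the previous cell state by elapsed time.

    prev_cell : list of floats, the previous cell state
    delta_t   : time elapsed since the previous observation (e.g. in days)
    w_d, b_d  : hypothetical per-unit weights and biases (assumed, not from the paper)
    """
    g = elapsed_time_decay(delta_t)
    adjusted = []
    for c, w, b in zip(prev_cell, w_d, b_d):
        short_term = math.tanh(w * c + b)   # short-term memory component
        long_term = c - short_term          # remainder is treated as long-term memory
        adjusted.append(long_term + g * short_term)
    return adjusted


# Example: the same cell state after a 30-day gap and after a 365-day gap.
cell = [0.8, -0.3]
weights, biases = [1.0, 1.0], [0.0, 0.0]
after_month = adjust_memory(cell, 30.0, weights, biases)
after_year = adjust_memory(cell, 365.0, weights, biases)
```

The adjusted state would then feed into the usual LSTM gate updates in place of the raw previous cell state. With this choice of g, a longer gap between financial reports removes more of the short-term memory, which is the behaviour one would want when filings arrive at uneven intervals.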