MOTGNN: Interpretable Graph Neural Networks for Multi-Omics Disease Classification

📅 2025-08-10
🤖 AI Summary
This paper targets the low interpretability and suboptimal accuracy of binary disease classification on high-dimensional, heterogeneous multi-omics data (DNA methylation, mRNA, miRNA), where intra- and inter-omic interactions are complex and class imbalance is severe. It proposes an interpretable graph neural network (GNN) framework that (i) uses XGBoost-driven supervised learning to construct modality-specific sparse graphs encoding feature relationships within each omics layer; (ii) combines modality-specific GNNs for hierarchical representation learning with a deep feedforward network for cross-omic fusion; and (iii) has built-in support for identifying key biomarkers and quantifying the contribution of each omics modality. Evaluated on three real-world disease datasets, the method improves ROC-AUC, accuracy, and F1-score by 5–10% over state-of-the-art baselines and reaches an F1-score of 87.2% under severe class imbalance.

📝 Abstract
Integrating multi-omics data, such as DNA methylation, mRNA expression, and microRNA (miRNA) expression, offers a comprehensive view of the biological mechanisms underlying disease. However, the high dimensionality and complex interactions among omics layers present major challenges for predictive modeling. We propose Multi-Omics integration with Tree-generated Graph Neural Network (MOTGNN), a novel and interpretable framework for binary disease classification. MOTGNN employs eXtreme Gradient Boosting (XGBoost) to perform omics-specific supervised graph construction, followed by modality-specific Graph Neural Networks (GNNs) for hierarchical representation learning, and a deep feedforward network for cross-omics integration. On three real-world disease datasets, MOTGNN outperforms state-of-the-art baselines by 5-10% in accuracy, ROC-AUC, and F1-score, and remains robust to severe class imbalance (e.g., 87.2% vs. 33.4% F1 on imbalanced data). The model maintains computational efficiency through sparse graphs (2.1-2.8 edges per node) and provides built-in interpretability, revealing both top-ranked biomarkers and the relative contributions of each omics modality. These results highlight MOTGNN's potential to improve both predictive accuracy and interpretability in multi-omics disease modeling.
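The abstract does not spell out how boosted trees become a graph. The sketch below illustrates one plausible reading, under the assumption that every parent-child pair of split features in a tree contributes an undirected edge between those two features. The function names, the toy feature IDs, and the input format (lists of feature pairs already extracted from a fitted model) are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_feature_graph(trees):
    """Build an undirected feature graph from parent-child split pairs.

    ASSUMPTION: `trees` is a list of trees, and each tree is a list of
    (parent_feature, child_feature) tuples, one per parent-child split.
    How these pairs are read out of a fitted XGBoost model is out of scope.
    """
    edges = set()
    for tree in trees:
        for parent, child in tree:
            if parent != child:  # ignore self-loops
                # Store each edge in a canonical order so duplicates collapse.
                edges.add((min(parent, child), max(parent, child)))

    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return edges, dict(adjacency)


def edges_per_node(edges, adjacency):
    """Sparsity measure: number of unique edges divided by number of nodes."""
    return len(edges) / len(adjacency) if adjacency else 0.0


# Hypothetical toy ensemble of two small trees over methylation-style IDs.
toy_trees = [
    [("cg001", "cg007"), ("cg001", "cg015"), ("cg007", "cg020")],
    [("cg015", "cg001"), ("cg015", "cg033")],
]

edges, adjacency = build_feature_graph(toy_trees)
density = edges_per_node(edges, adjacency)
print(f"{len(adjacency)} nodes, {len(edges)} edges, {density:.1f} edges per node")
```

Only features that appear in at least one split become nodes, which is one way such a graph could stay as sparse as the 2.1-2.8 edges per node reported in the abstract.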
Problem

Research questions and friction points this paper is trying to address.

Integrating high-dimensional multi-omics data for disease classification
Modeling complex interactions among DNA methylation, mRNA, and miRNA
Improving interpretability and accuracy in imbalanced disease datasets
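To make the imbalance point concrete, the following sketch compares accuracy and F1 on a toy label set with 5 positives and 95 negatives. The data and both sets of predictions are invented for illustration and have no connection to the paper's datasets or reported numbers.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1_score(y_true, y_pred):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy imbalanced data: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95

# Classifier A always predicts the majority (negative) class.
majority_pred = [0] * 100

# Classifier B finds 4 of the 5 positives and raises 2 false alarms.
better_pred = [1] * 4 + [0] * 1 + [1] * 2 + [0] * 93

acc_majority = accuracy(y_true, majority_pred)
f1_majority = f1_score(y_true, majority_pred)
acc_better = accuracy(y_true, better_pred)
f1_better = f1_score(y_true, better_pred)

print(f"A: accuracy={acc_majority:.2f}, F1={f1_majority:.2f}")
print(f"B: accuracy={acc_better:.2f}, F1={f1_better:.2f}")
```

The two classifiers differ by only two points of accuracy, while their F1-scores are far apart, which is why F1 is the more informative metric when one class is rare.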
Innovation

Methods, ideas, or system contributions that make the work stand out.

XGBoost for supervised graph construction
Modality-specific GNNs for representation learning
Deep feedforward network for cross-omics integration
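The three components listed above can be pictured as a single forward pass: one graph encoder per omics type, then concatenation, then a feedforward classifier. The sketch below is a dependency-free illustration of that data flow only. The layer type (a single normalized graph convolution), the mean pooling, the dimensions, the toy graphs, and the random untrained weights are all assumptions and do not reproduce the paper's architecture.

```python
import math
import random


def normalize_adjacency(adj):
    """Add self-loops, then apply D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def gnn_encode(x, a_hat, w):
    """One graph-convolution layer, then mean pooling over nodes.

    x: (nodes x in_dim) node features for one sample; w: (in_dim x hidden).
    """
    h = relu(matmul(matmul(a_hat, x), w))
    return [sum(col) / len(h) for col in zip(*h)]


def feedforward(z, w1, w2):
    """Two-layer classifier on the fused embedding; returns a probability."""
    hidden = relu(matmul([z], w1))
    logit = matmul(hidden, w2)[0][0]
    return 1.0 / (1.0 + math.exp(-logit))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
HIDDEN = 4  # hypothetical embedding size per omics type

# Hypothetical per-omics feature graphs (adjacency) and one sample's values.
omics = {
    "methylation": ([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]],
                    [0.2, 0.8, 0.5, 0.1]),
    "mrna": ([[0, 1, 1], [1, 0, 0], [1, 0, 0]], [1.3, 0.4, 2.1]),
    "mirna": ([[0, 1, 0], [1, 0, 1], [0, 1, 0]], [0.7, 0.0, 0.9]),
}

fused = []
for name, (adj, values) in omics.items():
    a_hat = normalize_adjacency(adj)
    x = [[v] for v in values]          # each feature is a node with one scalar
    w = random_matrix(1, HIDDEN, rng)  # untrained, random weights
    fused.extend(gnn_encode(x, a_hat, w))  # concatenate per-omics embeddings

w1 = random_matrix(len(fused), 8, rng)
w2 = random_matrix(8, 1, rng)
prob = feedforward(fused, w1, w2)
print(f"fused embedding size: {len(fused)}, predicted probability: {prob:.3f}")
```

Keeping a separate encoder per omics type means each modality is summarized on its own graph before the feedforward network mixes them, which matches the order of the components listed above.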