🤖 AI Summary
Traditional drug repositioning methods rely heavily on known drug–disease similarity metrics, generalize poorly across disease domains, and incur high experimental costs. To address these limitations, this paper proposes a two-stage multi-omics framework that combines unsupervised deep embedded clustering with supervised graph neural network (GNN) link prediction. First, an autoencoder compresses multi-omics features into a latent embedding, which is used to cluster 9,022 drugs into 35 clusters (mean silhouette score: 0.855). Second, a drug–disease heterogeneous biological network is constructed, and GNN-based link prediction is performed on it. The model achieves an accuracy of 0.901, an ROC AUC of 0.960, and an F1 score of 0.901 in drug–disease association prediction, and it yields a ranked list of 477 per-cluster links with predicted probabilities above 99%. The approach reduces the dependence on known similarities, supports repurposing of abandoned drugs, and may help identify indications across unrelated disease domains.
📝 Abstract
Drug repurposing has historically been an economically infeasible process for identifying novel uses for abandoned drugs. Modern machine learning has enabled the identification of complex biochemical intricacies in candidate drugs; however, many studies rely on simplified datasets with known drug-disease similarities. We propose a machine learning pipeline that combines unsupervised deep embedded clustering with supervised graph neural network link prediction to identify new drug-disease links from multi-omic data. Unsupervised autoencoder and cluster training reduced the dimensionality of the omic data to a compressed latent embedding. A total of 9,022 unique drugs were partitioned into 35 clusters with a mean silhouette score of 0.8550. The graph neural networks achieved strong predictive performance, with an accuracy of 0.901, an area under the receiver operating characteristic curve of 0.960, and an F1 score of 0.901. A ranked list of 477 per-cluster links with predicted probabilities exceeding 99 percent was generated. This study could provide new drug-disease link prospects across unrelated disease domains while advancing the understanding of machine learning in drug repurposing studies.
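As a rough illustration of the quantities reported in the abstract, the following is a minimal sketch of how accuracy, F1 score, ROC AUC, and a ranked list of high-probability links could be computed from a link predictor's output. It is not the authors' code: the function names, the 0.5 decision threshold, and the `(cluster_id, disease, probability)` tuple layout are assumptions made for illustration, and only the "exceeding 99 percent" cutoff comes from the text.

```python
from typing import List, Tuple

# Hypothetical layout for a candidate link: (cluster_id, disease, probability).
Link = Tuple[int, str, float]


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive is scored higher; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(labels: List[int], scores: List[float],
                    threshold: float = 0.5) -> Tuple[float, float]:
    """Accuracy and F1 after thresholding scores (the threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    accuracy = correct / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def high_probability_links(candidates: List[Link],
                           cutoff: float = 0.99) -> List[Link]:
    """Keep links whose probability strictly exceeds the cutoff,
    ranked from most to least probable."""
    kept = [c for c in candidates if c[2] > cutoff]
    return sorted(kept, key=lambda c: c[2], reverse=True)
```

The pairwise AUC above is quadratic in the number of examples, which is fine for a sketch; for real evaluation sets a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.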