K-Paths: Reasoning over Graph Paths for Drug Repurposing and Drug Interaction Prediction

📅 2025-02-18
🤖 AI Summary
This study addresses three obstacles to using large-scale biomedical knowledge graphs (KGs) for drug repurposing and drug-drug interaction (DDI) severity prediction: the complexity of graph traversal, subgraph-based retrieval methods that only suit graph neural networks (GNNs), and limited interpretability. K-Paths uses a diversity-aware adaptation of Yen's algorithm to retrieve the K shortest loopless paths between the entities in a query, prioritizing biologically relevant and diverse relationships. It then transforms those paths into a structured format that large language models (LLMs) can process directly, so the same retrieved paths serve both LLMs and GNNs and support explainable reasoning. In zero-shot evaluation, K-Paths raises the F1-score of Llama 3.1 8B by 12.45 points on drug repurposing and 13.42 points on DDI severity prediction; Llama 70B gains 6.18 and 8.46 points, respectively. For EmerGNN, an existing state-of-the-art GNN, K-Paths reduces KG size by 90% while maintaining strong predictive performance, which improves supervised training efficiency.

📝 Abstract
Drug discovery is a complex and time-intensive process that requires identifying and validating new therapeutic candidates. Computational approaches using large-scale biomedical knowledge graphs (KGs) offer a promising solution to accelerate this process. However, extracting meaningful insights from large-scale KGs remains challenging due to the complexity of graph traversal. Existing subgraph-based methods are tailored to graph neural networks (GNNs), making them incompatible with other models, such as large language models (LLMs). We introduce K-Paths, a retrieval framework that extracts structured, diverse, and biologically meaningful paths from KGs. Integrating these paths enables LLMs and GNNs to effectively predict unobserved drug-drug and drug-disease interactions. Unlike traditional path-ranking approaches, K-Paths retrieves and transforms paths into a structured format that LLMs can directly process, facilitating explainable reasoning. K-Paths employs a diversity-aware adaptation of Yen's algorithm to retrieve the K shortest loopless paths between entities in an interaction query, prioritizing biologically relevant and diverse relationships. Our experiments on benchmark datasets show that K-Paths improves the zero-shot F1-score of Llama 3.1 8B by 12.45 points on drug repurposing and 13.42 points on interaction severity prediction. We also show that Llama 70B achieves F1-score gains of 6.18 and 8.46 points, respectively. K-Paths also improves the supervised training efficiency of EmerGNN, a state-of-the-art GNN, by reducing KG size by 90% while maintaining strong predictive performance. Beyond its scalability and efficiency, K-Paths uniquely bridges the gap between KGs and LLMs, providing explainable rationales for predicted interactions. These capabilities show that K-Paths is a valuable tool for efficient data-driven drug discovery.
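
The retrieval step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and not Yen's algorithm itself: it swaps in a breadth-first enumeration, which on an unweighted graph also yields loopless paths in nondecreasing hop count, and it assumes that "diversity" means keeping one path per distinct relation sequence. The toy triples, the function names `build_adjacency` and `k_diverse_paths`, and the `max_hops` cap are all hypothetical.

```python
from collections import deque

# Toy knowledge graph as (head, relation, tail) triples; all names are made up.
TRIPLES = [
    ("DrugA", "binds", "GeneX"),
    ("DrugA", "binds", "GeneY"),
    ("GeneX", "associated_with", "DiseaseD"),
    ("GeneY", "associated_with", "DiseaseD"),
    ("DrugA", "resembles", "DrugB"),
    ("DrugB", "treats", "DiseaseD"),
]


def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj = {}
    for head, rel, tail in triples:
        adj.setdefault(head, []).append((rel, tail))
    return adj


def k_diverse_paths(triples, source, target, k, max_hops=3):
    """Return up to k loopless paths, shortest first, one per relation sequence.

    Assumption: edges are followed in their stored direction only, and two
    paths count as redundant when their relation sequences are identical.
    """
    adj = build_adjacency(triples)
    queue = deque([(source, [])])  # (current entity, triples walked so far)
    seen_signatures = set()
    results = []
    while queue and len(results) < k:
        node, path = queue.popleft()
        if node == target and path:
            signature = tuple(rel for _, rel, _ in path)
            if signature not in seen_signatures:  # diversity filter
                seen_signatures.add(signature)
                results.append(path)
            continue
        if len(path) >= max_hops:
            continue
        visited = {source} | {tail for _, _, tail in path}
        for rel, nxt in adj.get(node, []):
            if nxt not in visited:  # keeps every path loopless
                queue.append((nxt, path + [(node, rel, nxt)]))
    return results


paths = k_diverse_paths(TRIPLES, "DrugA", "DiseaseD", k=2)
for p in paths:
    print(p)
```

In this toy graph the route through `GeneY` is dropped because it repeats the relation sequence of the route through `GeneX`, so the second slot goes to the path through `DrugB` instead.
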
Problem

Research questions and friction points this paper is trying to address.

Accelerating drug discovery with large-scale biomedical knowledge graphs
Predicting unobserved drug-drug and drug-disease interactions
Making KG retrieval usable by both graph neural networks and large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

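Extracts structured, diverse, and biologically meaningful paths from KGs.
Supplies the retrieved paths to both LLMs and GNNs, in a structured format that LLMs can process directly.
Adapts Yen's algorithm with diversity awareness to retrieve the K shortest loopless paths between query entities.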
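
How retrieved paths might be turned into text that an LLM can read is sketched below in Python. The arrow notation, the prompt wording, the example entities, and the names `path_to_text` and `build_prompt` are assumptions for illustration; the summary above does not specify the exact format K-Paths uses.

```python
def path_to_text(path):
    """Render a list of (head, relation, tail) triples as one readable chain."""
    if not path:
        return ""
    parts = [path[0][0]]  # start from the head of the first triple
    for _, rel, tail in path:
        parts.append(f"-[{rel}]->")
        parts.append(tail)
    return " ".join(parts)


def build_prompt(drug, disease, paths):
    """Assemble a zero-shot prompt that lists the paths as numbered evidence."""
    lines = [
        f"Question: could {drug} be used to treat {disease}?",
        "Knowledge graph paths:",
    ]
    lines += [f"{i}. {path_to_text(p)}" for i, p in enumerate(paths, start=1)]
    lines.append("Answer using only the paths above and explain your reasoning.")
    return "\n".join(lines)


# Hypothetical paths, written as lists of (head, relation, tail) triples.
example_paths = [
    [("DrugA", "binds", "GeneX"), ("GeneX", "associated_with", "DiseaseD")],
    [("DrugA", "resembles", "DrugB"), ("DrugB", "treats", "DiseaseD")],
]

prompt = build_prompt("DrugA", "DiseaseD", example_paths)
print(prompt)
```

Keeping each path on its own numbered line lets the model cite a specific path in its rationale, which is one plausible way to obtain the explainable reasoning the summary describes.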