DrKGC: Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion across General and Biomedical Domains

📅 2025-05-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) struggle to effectively perceive and reason over knowledge graph (KG) structures, limiting knowledge graph completion (KGC) performance. To address this, we propose a structure-aware LLM reasoning framework. Our method integrates logical rule-guided, bottom-up dynamic subgraph retrieval to precisely extract task-relevant local KG structures, and a GCN-adapter-driven structural embedding enhancement technique that enables fine-grained topological modeling and interpretable reasoning within LLMs. The framework is lightweight and efficient, unifying rule learning, structural embedding, and retrieval-augmented fine-tuning. Evaluated on four standard benchmarks—including two general-domain and two biomedical datasets—our approach achieves significant improvements over state-of-the-art methods. Case studies in the biomedical domain further demonstrate high accuracy and clinically meaningful interpretability.
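The rule-guided, bottom-up retrieval step can be pictured as following a learned relation path outward from the query head entity and keeping every triple visited. The sketch below is a minimal illustration under assumed names (`retrieve_subgraph`, the toy biomedical triples, and the example rule are all hypothetical, not the authors' implementation):

```python
from collections import defaultdict

# Toy KG: (head, relation, tail) triples (illustrative, not a real dataset).
TRIPLES = [
    ("aspirin", "inhibits", "COX1"),
    ("COX1", "involved_in", "inflammation"),
    ("inflammation", "causes", "headache"),
    ("aspirin", "treats", "headache"),
]

def build_index(triples):
    """Index outgoing edges by (head, relation) for fast traversal."""
    index = defaultdict(list)
    for h, r, t in triples:
        index[(h, r)].append(t)
    return index

def retrieve_subgraph(head, rule_body, index):
    """Follow a learned rule body (a relation path) bottom-up from the
    query head, collecting every triple visited along the way."""
    frontier = {head}
    subgraph = []
    for rel in rule_body:
        next_frontier = set()
        for ent in frontier:
            for tail in index.get((ent, rel), []):
                subgraph.append((ent, rel, tail))
                next_frontier.add(tail)
        frontier = next_frontier
    return subgraph, frontier

index = build_index(TRIPLES)
# Hypothetical learned rule:
#   treats(x, y) <- inhibits(x, z) ^ involved_in(z, w) ^ causes(w, y)
subgraph, answers = retrieve_subgraph(
    "aspirin", ["inhibits", "involved_in", "causes"], index
)
print(answers)  # {'headache'}
```

The retrieved `subgraph` is exactly the task-relevant local structure the summary describes; in the paper's pipeline it would then feed the GCN adapter rather than being printed.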

📝 Abstract
Knowledge graph completion (KGC) aims to predict missing triples in knowledge graphs (KGs) by leveraging existing triples and textual information. Recently, generative large language models (LLMs) have been increasingly employed for graph tasks. However, current approaches typically encode graph context in textual form, which fails to fully exploit the potential of LLMs for perceiving and reasoning about graph structures. To address this limitation, we propose DrKGC (Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion). DrKGC employs a flexible lightweight model training strategy to learn structural embeddings and logical rules within the KG. It then leverages a novel bottom-up graph retrieval method to extract a subgraph for each query guided by the learned rules. Finally, a graph convolutional network (GCN) adapter uses the retrieved subgraph to enhance the structural embeddings, which are then integrated into the prompt for effective LLM fine-tuning. Experimental results on two general domain benchmark datasets and two biomedical datasets demonstrate the superior performance of DrKGC. Furthermore, a realistic case study in the biomedical domain highlights its interpretability and practical utility.
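The GCN-adapter step in the abstract amounts to propagating structural embeddings over the retrieved subgraph before they enter the prompt. Below is a minimal one-layer sketch of that idea, assuming symmetric normalization; the function name, dimensions, and random weights are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def gcn_adapter(adj, embeddings, weight):
    """One symmetrically normalized GCN layer:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W).
    Maps structural embeddings into a (here hypothetical) LLM prompt space."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))     # D^{-1/2}
    norm = d_inv_sqrt @ a_hat @ d_inv_sqrt       # normalized adjacency
    return np.maximum(norm @ embeddings @ weight, 0.0)  # ReLU

rng = np.random.default_rng(0)
n_entities, d_struct, d_prompt = 4, 8, 16        # assumed sizes
adj = np.array([[0, 1, 0, 0],                    # adjacency of a retrieved
                [1, 0, 1, 0],                    # 4-entity subgraph (toy)
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
H = rng.standard_normal((n_entities, d_struct))  # structural embeddings
W = rng.standard_normal((d_struct, d_prompt))    # adapter projection

enhanced = gcn_adapter(adj, H, W)
print(enhanced.shape)  # (4, 16)
```

In DrKGC these enhanced embeddings are integrated into the prompt for LLM fine-tuning; the sketch stops at producing them.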
Problem

Research questions and friction points this paper is trying to address.

Predict missing triples in knowledge graphs using LLMs
Enhance LLM graph perception via dynamic subgraph retrieval
Improve KGC performance in general and biomedical domains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic subgraph retrieval for KG completion
Lightweight model learns structural embeddings
GCN adapter enhances LLM fine-tuning