Assessing LLMs for Serendipity Discovery in Knowledge Graphs: A Case for Drug Repurposing

📅 2025-11-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) struggle to identify “serendipitous discoveries”—unexpected yet scientifically valuable answers—in scientific knowledge graphs, particularly in drug repurposing. Method: We formalize the “serendipity-aware” knowledge graph question answering (KGQA) task, propose SerenQA—a unified framework integrating LLMs, knowledge graph retrieval, subgraph reasoning, and serendipity quantification—and introduce a principled evaluation metric balancing relevance, novelty, and surprise. We release an expert-annotated benchmark dataset and a three-stage evaluation protocol. Contribution/Results: Experiments reveal that while state-of-the-art LLMs excel at factual retrieval, they exhibit significant limitations in detecting truly unexpected, translationally viable drug-repurposing hypotheses. SerenQA establishes a reproducible benchmark and a novel paradigm for evaluating and advancing LLMs’ scientific insight capabilities—especially their capacity for serendipitous discovery in structured biomedical knowledge.

📝 Abstract
Large Language Models (LLMs) have greatly advanced knowledge graph question answering (KGQA), yet existing systems are typically optimized for returning highly relevant but predictable answers. A missing yet desired capacity is to exploit LLMs to suggest surprising and novel ("serendipitous") answers. In this paper, we formally define the serendipity-aware KGQA task and propose the SerenQA framework to evaluate LLMs' ability to uncover unexpected insights in scientific KGQA tasks. SerenQA includes a rigorous serendipity metric based on relevance, novelty, and surprise, along with an expert-annotated benchmark derived from the Clinical Knowledge Graph, focused on drug repurposing. Additionally, it features a structured evaluation pipeline encompassing three subtasks: knowledge retrieval, subgraph reasoning, and serendipity exploration. Our experiments reveal that while state-of-the-art LLMs perform well on retrieval, they still struggle to identify genuinely surprising and valuable discoveries, underscoring significant room for future improvements. Our curated resources and extended version are released at: https://cwru-db-group.github.io/serenQA.
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs for discovering serendipitous insights in knowledge graphs
Assessing LLMs' ability to suggest novel drug repurposing opportunities
Measuring LLMs' capacity to identify surprising scientific discoveries
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces SerenQA framework for serendipity evaluation
Defines metrics combining relevance, novelty and surprise
Proposes structured pipeline with three reasoning subtasks
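The metric described above combines relevance, novelty, and surprise into a single serendipity score. As a minimal illustrative sketch (the paper's exact formulation and weights are not reproduced here; the function name, weighting scheme, and geometric-mean combination are assumptions for illustration only):

```python
# Hypothetical sketch of a serendipity score combining relevance,
# novelty, and surprise; SerenQA's actual metric may differ.

def serendipity_score(relevance: float, novelty: float, surprise: float,
                      weights: tuple = (1.0, 1.0, 1.0)) -> float:
    """Weighted geometric mean of the three components (each in [0, 1]).

    A geometric mean is chosen here so that an answer scoring zero on
    any single dimension (e.g. highly novel but irrelevant) receives an
    overall score of zero, reflecting that all three properties are
    required for a serendipitous discovery.
    """
    scores = (relevance, novelty, surprise)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("component scores must lie in [0, 1]")
    total_w = sum(weights)
    result = 1.0
    for s, w in zip(scores, weights):
        result *= s ** (w / total_w)
    return result
```

For example, a candidate drug-repurposing answer that is perfectly relevant but neither novel nor surprising would score zero under this sketch, while a balanced candidate scores strictly between 0 and 1.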
Mengying Wang
Case Western Reserve University
Data Management, ML Systems, Knowledge Graphs, Agentic Workflow
Chenhui Ma
Case Western Reserve University, Cleveland, OH, USA
Ao Jiao
Case Western Reserve University, Cleveland, OH, USA
Tuo Liang
Case Western Reserve University
VLM, Visual Reasoning, Visual Hallucination
Pengjun Lu
Case Western Reserve University, Cleveland, OH, USA
Shrinidhi Hegde
Case Western Reserve University, Cleveland, OH, USA
Yu Yin
Case Western Reserve University, Cleveland, OH, USA
Evren Gurkan-Cavusoglu
Case Western Reserve University, Cleveland, OH, USA
Yinghui Wu
Case Western Reserve University
big data, databases, knowledge bases, data mining, data quality