CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era

📅 2024-12-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Large-scale modern knowledge graphs (e.g., Wikidata) create accuracy and efficiency bottlenecks for LLM-based structured retrieval because of their opaque identifiers, heterogeneous relations, and massive scale. Method: We propose a semantics-preserving RDF-to-property-graph transformation that enables efficient, Cypher-native graph querying. Contribution/Results: We introduce CypherBench, the first Cypher-centric benchmark, covering 11 domains, 7.8 million entities, and over 10,000 questions, and design a composite evaluation framework grounded in semantic equivalence and execution accuracy. Experiments demonstrate substantial improvements in recall and Cypher generation fidelity for complex graph queries, establishing a scalable, production-ready knowledge retrieval infrastructure for GraphRAG and related systems.

📝 Abstract

Retrieval from graph data is crucial for augmenting large language models (LLMs) with both open-domain knowledge and private enterprise data, and it is also a key component of the recent GraphRAG system (Edge et al., 2024). Despite decades of research on knowledge graphs and knowledge base question answering, leading LLM frameworks (e.g., LangChain and LlamaIndex) have only minimal support for retrieval from modern encyclopedic knowledge graphs like Wikidata. In this paper, we analyze the root cause and suggest that modern RDF knowledge graphs (e.g., Wikidata, Freebase) are less efficient for LLMs because of overly large schemas that far exceed the typical LLM context window, the use of resource identifiers, overlapping relation types, and a lack of normalization. As a solution, we propose property graph views on top of the underlying RDF graph that can be efficiently queried by LLMs using Cypher. We instantiated this idea on Wikidata and introduced CypherBench, the first benchmark with 11 large-scale, multi-domain property graphs with 7.8 million entities and over 10,000 questions. To achieve this, we tackled several key challenges, including developing an RDF-to-property-graph conversion engine, creating a systematic pipeline for text-to-Cypher task generation, and designing new evaluation metrics.
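To make the idea of a property graph view more concrete, here is a minimal sketch in Python. It is not the paper's conversion engine. The identifiers (`E1`, `R0`, `C1`, ...), the mapping tables, and the function names `to_property_graph` and `to_cypher` are all hypothetical and exist only for illustration. The sketch assumes a small, hand-picked schema that maps opaque RDF-style identifiers to readable node labels, property names, and relationship types, and it drops every triple outside that schema.

```python
from collections import defaultdict

# Toy RDF-style triples: (subject, predicate, object, object_is_literal).
# All identifiers are made up for illustration; they are not real Wikidata IDs.
TRIPLES = [
    ("E1", "R0", "C1", False),          # E1 is an instance of class C1
    ("E1", "R1", "Ada Example", True),  # E1 has a literal name
    ("E1", "R2", "E2", False),          # E1 is linked to E2
    ("E2", "R0", "C2", False),
    ("E2", "R1", "Sampletown", True),
    ("E2", "R9", "E7", False),          # predicate outside the view's schema
]

# Hypothetical schema of the view: opaque identifiers -> readable names.
TYPE_PREDICATE = "R0"
NODE_LABELS = {"C1": "Person", "C2": "City"}
PROPERTY_NAMES = {"R1": "name"}
RELATION_TYPES = {"R2": "BORN_IN"}


def to_property_graph(triples):
    """Build labeled nodes with properties, plus typed edges, from triples."""
    nodes = defaultdict(lambda: {"label": None, "properties": {}})
    edges = []
    for subj, pred, obj, is_literal in triples:
        if pred == TYPE_PREDICATE:
            nodes[subj]["label"] = NODE_LABELS[obj]
        elif is_literal and pred in PROPERTY_NAMES:
            nodes[subj]["properties"][PROPERTY_NAMES[pred]] = obj
        elif not is_literal and pred in RELATION_TYPES:
            edges.append((subj, RELATION_TYPES[pred], obj))
        # Anything else is dropped: the view exposes only the chosen schema.
    return dict(nodes), edges


def to_cypher(nodes, edges):
    """Render simple Cypher statements (naive quoting, toy data only)."""
    statements = []
    for _, node in sorted(nodes.items()):
        props = ", ".join(
            f"{key}: {value!r}" for key, value in sorted(node["properties"].items())
        )
        statements.append(f"CREATE (:{node['label']} {{{props}}})")
    for src, rel, dst in edges:
        src_label, dst_label = nodes[src]["label"], nodes[dst]["label"]
        src_name = nodes[src]["properties"]["name"]
        dst_name = nodes[dst]["properties"]["name"]
        statements.append(
            f"MATCH (a:{src_label} {{name: {src_name!r}}}), "
            f"(b:{dst_label} {{name: {dst_name!r}}}) "
            f"CREATE (a)-[:{rel}]->(b)"
        )
    return statements


if __name__ == "__main__":
    graph_nodes, graph_edges = to_property_graph(TRIPLES)
    for statement in to_cypher(graph_nodes, graph_edges):
        print(statement)
```

Against a view like this, a question such as "Where was Ada Example born?" could be answered with a short query like `MATCH (p:Person {name: 'Ada Example'})-[:BORN_IN]->(c:City) RETURN c.name`, which refers only to readable labels and relationship types rather than resource identifiers.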

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

Knowledge Graph Retrieval

Information Accessibility

Innovation

Methods, ideas, or system contributions that make the work stand out.

CypherBench

Property Graphs

Cypher Query Language
