DynaGRAG: Improving Language Understanding and Generation through Dynamic Subgraph Representation in Graph Retrieval-Augmented Generation

📅 2024-12-24
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address the weak representational capacity and insufficient diversity of subgraphs in Graph Retrieval-Augmented Generation (GRAG), this paper proposes a dynamic structure-aware subgraph retrieval and fusion framework. Methodologically, it introduces (1) Dynamic Similarity-Aware BFS (DSA-BFS), a traversal algorithm that jointly considers subgraph relevance, structural integrity, and traversal efficiency; (2) a query-aware, de-duplicated subgraph retrieval mechanism to ensure semantic diversity; and (3) a structure-aware fusion module integrating GCN encoding, LLM-guided hard prompting, and node-level de-duplicated two-step mean pooling. Evaluated on multiple knowledge-intensive question answering and generation benchmarks, the framework achieves significant performance gains. The results demonstrate that high-quality, diverse subgraph representations critically enhance large language models' ability to leverage external knowledge effectively.
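The summary names DSA-BFS but this page carries no pseudocode, so the following is only a rough Python sketch of what a similarity-aware BFS of this kind could look like: a best-first expansion ordered by each node's embedding similarity to the query, with an admission threshold that relaxes with depth. The function name `dsa_bfs`, the decay schedule, and all parameters are assumptions for illustration, not the authors' implementation.

```python
import heapq
import math

def cosine(a, b):
    # Plain cosine similarity over Python lists; epsilon guards zero norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / (norm + 1e-9)

def dsa_bfs(graph, node_emb, query_emb, start,
            base_threshold=0.3, decay=0.95, max_nodes=50):
    """Similarity-aware BFS sketch: expand neighbours in order of their
    embedding similarity to the query, relaxing the admission threshold
    as the traversal moves deeper (the hypothesised "dynamic" part)."""
    visited = {start}
    # Max-heap on similarity, emulated by negating the score.
    frontier = [(-cosine(node_emb[start], query_emb), start, 0)]
    subgraph = []
    while frontier and len(subgraph) < max_nodes:
        neg_sim, node, depth = heapq.heappop(frontier)
        subgraph.append(node)
        threshold = base_threshold * (decay ** depth)  # relaxes with depth
        for nbr in graph.get(node, []):
            if nbr in visited:
                continue
            sim = cosine(node_emb[nbr], query_emb)
            if sim >= threshold:  # admit only query-relevant neighbours
                visited.add(nbr)
                heapq.heappush(frontier, (-sim, nbr, depth + 1))
    return subgraph
```

The priority queue makes the traversal best-first rather than strictly level-by-level, which is one plausible way to trade off relevance against traversal cost as the summary describes.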

📝 Abstract
Graph Retrieval-Augmented Generation (GRAG or Graph RAG) architectures aim to enhance language understanding and generation by leveraging external knowledge. However, effectively capturing and integrating the rich semantic information present in textual and structured data remains a challenge. To address this, a novel GRAG framework is proposed to focus on enhancing subgraph representation and diversity within the knowledge graph. By improving graph density, capturing entity and relation information more effectively, and dynamically prioritizing relevant and diverse subgraphs, the proposed approach enables a more comprehensive understanding of the underlying semantic structure. This is achieved through a combination of de-duplication processes, two-step mean pooling of embeddings, query-aware retrieval considering unique nodes, and a Dynamic Similarity-Aware BFS (DSA-BFS) traversal algorithm. Integrating Graph Convolutional Networks (GCNs) and Large Language Models (LLMs) through hard prompting further enhances the learning of rich node and edge representations while preserving the hierarchical subgraph structure. Experimental results on multiple benchmark datasets demonstrate the effectiveness of the proposed GRAG framework, showcasing the significance of enhanced subgraph representation and diversity for improved language understanding and generation.
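As a concrete illustration of the de-duplication and two-step mean pooling the abstract mentions, here is a minimal sketch under one plausible reading: duplicate nodes are dropped within each retrieved subgraph, node embeddings are mean-pooled per subgraph, and the subgraph vectors are then mean-pooled into a single representation. The function name and the exact pooling order are assumptions, not taken from the paper.

```python
def two_step_mean_pool(subgraphs, node_emb):
    """Two-step mean pooling with node-level de-duplication (sketch).
    Step 1: de-duplicate nodes within each subgraph, then mean-pool them.
    Step 2: mean-pool the per-subgraph vectors into one embedding."""
    dim = len(next(iter(node_emb.values())))
    pooled = []
    for sg in subgraphs:
        unique = list(dict.fromkeys(sg))  # drop duplicate nodes, keep order
        pooled.append([sum(node_emb[n][d] for n in unique) / len(unique)
                       for d in range(dim)])
    return [sum(vec[d] for vec in pooled) / len(pooled) for d in range(dim)]
```

De-duplicating before the first mean keeps a node that appears many times in one subgraph from dominating that subgraph's vector, which matches the abstract's emphasis on diversity.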
Problem

Research questions and friction points this paper is trying to address.

Natural Language Processing
Semantic Information
Data Diversity

Innovation

Methods, ideas, or system contributions that make the work stand out.

DynaGRAG
Subgraph Representation
Dynamic Optimization