Beyond Chunk-Local Extraction: Cross-Chunk Graph Augmentation for GraphRAG

📅 2026-05-27
🤖 AI Summary
Existing GraphRAG approaches build their knowledge graphs one text chunk at a time, so relations whose evidence spans several chunks never enter the index, which hurts performance on complex question answering. This work proposes CrossAug, an offline indexing step that uses self-supervised graph corruption to train a topology-aware graph neural network (GNN), which then scores subgraphs to identify regions likely to be missing cross-chunk relations. An evidence-grounded large language model (LLM) fills in relations only for the selected high-scoring regions, avoiding the combinatorial explosion of checking every chunk combination. Across three mainstream GraphRAG frameworks and four multi-hop and long-document question answering benchmarks, CrossAug consistently improves performance, supporting the effectiveness and generality of cross-chunk graph augmentation.
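To give a rough sense of the combinatorial explosion mentioned above, the sketch below counts how many chunk combinations an exhaustive LLM pass would have to inspect. The function name `candidate_chunk_sets`, the corpus size of 1,000 chunks, and the cap on combination size are illustrative assumptions, not figures from the paper.

```python
from math import comb


def candidate_chunk_sets(num_chunks: int, max_size: int) -> int:
    """Count unordered chunk combinations of size 2..max_size.

    Hypothetical helper: it only illustrates how quickly exhaustive
    cross-chunk checking grows; it is not part of CrossAug.
    """
    return sum(comb(num_chunks, k) for k in range(2, max_size + 1))


# Assumed corpus of 1,000 chunks (illustrative, not from the paper).
pairs_only = candidate_chunk_sets(1000, 2)
pairs_and_triples = candidate_chunk_sets(1000, 3)
print(pairs_only)         # 499500
print(pairs_and_triples)  # 166666500
```

Even when restricted to pairs and triples, the count already runs to hundreds of millions of candidate LLM calls, which is the motivation for scoring regions first and completing only the selected ones.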
📝 Abstract
GraphRAG extends retrieval-augmented generation by organizing corpora as explicit knowledge graphs, enabling graph-based retrieval for complex question answering. However, existing frameworks extract entities and relations within individual chunks, leaving cross-chunk relations (those whose evidence spans multiple passages) systematically absent from the index. Exhaustive LLM-based recovery of such relations is impractical due to the combinatorial explosion of chunk combinations. We present CrossAug, a GNN-guided CROSS-Chunk Graph AUGmentation method that enriches GraphRAG indices with cross-chunk relational structure as an offline step before query-time retrieval. CrossAug derives training supervision through self-supervised graph corruption, uses a topology-aware GNN to score subgraphs for missingness, and applies evidence-grounded LLM completion only to selected high-scoring regions. Experiments on three LLM-based GraphRAG frameworks across four multi-hop and long-document QA benchmarks demonstrate that CrossAug consistently improves performance, confirming the benefit of cross-chunk graph augmentation for retrieval-based question answering. Our code is available at https://github.com/DonFinliani/CrossAug.
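The idea of deriving supervision from graph corruption can be sketched as follows: hide some known edges, then label the nodes around each hidden edge as places where a relation is "missing". This is a minimal sketch under stated assumptions. The abstract does not specify the corruption scheme, so random edge dropping, the node-level labels, and the names `corrupt_graph` and `missingness_labels` are all hypothetical stand-ins rather than the authors' implementation.

```python
import random

Edge = tuple[str, str]


def corrupt_graph(edges: list[Edge], drop_fraction: float = 0.2,
                  seed: int = 0) -> tuple[list[Edge], list[Edge]]:
    """Randomly hide a fraction of edges (assumed corruption scheme).

    Returns (kept_edges, dropped_edges). The dropped edges act as
    free supervision: we know a relation is missing there.
    """
    rng = random.Random(seed)
    ordered = sorted(edges)
    n_drop = max(1, int(len(ordered) * drop_fraction))
    dropped = set(rng.sample(ordered, n_drop))
    kept = [e for e in ordered if e not in dropped]
    return kept, sorted(dropped)


def missingness_labels(edges: list[Edge],
                       dropped: list[Edge]) -> dict[str, int]:
    """Label a node 1 if it touches a hidden edge, else 0 (hypothetical)."""
    nodes = {n for e in edges for n in e}
    touched = {n for e in dropped for n in e}
    return {n: int(n in touched) for n in sorted(nodes)}


# Toy entity graph (made-up entities, for illustration only).
graph = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("A", "E")]
kept, dropped = corrupt_graph(graph, drop_fraction=0.4, seed=0)
labels = missingness_labels(graph, dropped)
print(len(kept), len(dropped))
print(labels)
```

A scoring model would then be trained on the corrupted graph (`kept`) to predict `labels`, so that at indexing time it can flag regions of the real graph that look like they have lost an edge. In this sketch the labels are per node; the paper scores subgraphs, and its actual labeling granularity may differ.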
Problem

Research questions and friction points this paper is trying to address.

GraphRAG
cross-chunk relations
knowledge graph
retrieval-augmented generation
multi-hop QA
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-Chunk Relation Extraction
GraphRAG
Graph Neural Network
Self-Supervised Graph Corruption
Retrieval-Augmented Generation