ContextRAG: Extraction-Free Hierarchical Graph Construction for Retrieval-Augmented Generation

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the high computational and token costs of indexing in conventional graph-based RAG systems, which rely on large language models (LLMs) for entity and relation extraction, a cost that grows with corpus size. The authors propose ContextRAG, a hierarchical graph construction method whose topology is built without LLM-based entity or relation extraction. It combines text chunk embeddings with residual-quantization k-means clustering and Formal Concept Analysis under Łukasiewicz residuated logic to build a fuzzy concept graph, and it induces context nodes via soft fuzzy join and meet operations. Indexing overhead drops sharply: on a 130-task UltraDomain subset, the index is built with 30 LLM calls and 22,073 tokens. ContextRAG reaches 33.6% F1 overall and 36.8% F1 on multi-hop tasks. Queries that retrieve at least one lattice-derived node in the top five score 3.9 percentage points higher in F1 than queries that do not, an association the authors describe as diagnostic rather than causal.
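To make the logic concrete, here is a minimal Python sketch of the standard Łukasiewicz t-norm and residuum, together with the textbook fuzzy derivation operators of Formal Concept Analysis built on them. The incidence matrix `incidence` (chunks by cluster attributes) and the names `luk_tnorm`, `luk_residuum`, `up`, and `down` are hypothetical and chosen for illustration; the summary does not specify how ContextRAG defines its attributes or its soft join and meet, so this should not be read as the authors' implementation.

```python
from typing import List

def luk_tnorm(a: float, b: float) -> float:
    """Lukasiewicz t-norm (fuzzy conjunction): max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)

def luk_residuum(a: float, b: float) -> float:
    """Lukasiewicz residuum (fuzzy implication): min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def up(obj_degrees: List[float], incidence: List[List[float]]) -> List[float]:
    """Degree to which each attribute is shared by a fuzzy set of objects."""
    n_attrs = len(incidence[0])
    return [
        min(luk_residuum(a, row[j]) for a, row in zip(obj_degrees, incidence))
        for j in range(n_attrs)
    ]

def down(attr_degrees: List[float], incidence: List[List[float]]) -> List[float]:
    """Degree to which each object has a fuzzy set of attributes."""
    return [
        min(luk_residuum(b, row[j]) for j, b in enumerate(attr_degrees))
        for row in incidence
    ]

# Hypothetical membership degrees: 3 chunks (rows) x 2 cluster attributes (columns).
incidence = [
    [1.0, 0.5],
    [0.75, 1.0],
    [0.25, 0.5],
]

seed = [1.0, 1.0, 0.0]            # start from chunks 0 and 1
intent = up(seed, incidence)      # attributes shared by the seed chunks
extent = down(intent, incidence)  # closure: all chunks matching that intent
print(intent, extent)
```

In this toy example the pair `(extent, intent)` is a fuzzy formal concept: applying `up` to `extent` returns `intent` again. Note that the third chunk enters the extent with a partial degree even though it was not in the seed, which is the kind of graded membership a crisp concept lattice cannot express.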

📝 Abstract

Graph-structured retrieval-augmented generation (RAG) systems can improve answer quality on multi-hop questions, but many current systems rely on large language models (LLMs) to extract entities, relations, and summaries during indexing. These calls add token and wall-clock costs that grow with corpus size. We present ContextRAG, a graph RAG system whose graph topology is constructed without LLM-based entity or relation extraction. ContextRAG derives a fuzzy concept graph over chunk embeddings using residual-quantization k-means and Formal Concept Analysis with Łukasiewicz residuated logic. Bridge-like and meet-derived context nodes are induced by soft fuzzy join and meet operations, rather than by LLM-written graph edges. On a 130-task UltraDomain subset, ContextRAG builds its index with 30 LLM calls and 22,073 tokens. In contrast, a local HiRAG reproduction stress test required 870 indexing calls and 3.54M tokens on a 20-task subset before failing during graph construction; linear extrapolation to 130 tasks implies over 23M indexing tokens. ContextRAG obtains 33.6% F1 overall and 36.8% F1 on multi-hop tasks. An activation analysis shows that queries retrieving at least one lattice-derived node in the top five achieve +3.9 percentage points F1 over queries that do not; this association is diagnostic rather than causal.
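The linear extrapolation quoted in the abstract can be checked with a few lines of Python. The input figures come from the abstract; the variable names, the helper `extrapolate_linear`, and the derived token ratio are illustrative additions. The calculation assumes cost scales proportionally with task count and treats the rounded 3.54M figure as exact.

```python
# Figures reported in the abstract.
hirag_tokens_measured = 3_540_000   # HiRAG reproduction, indexing tokens (rounded)
tasks_measured = 20                 # size of the HiRAG stress-test subset
tasks_target = 130                  # size of the ContextRAG evaluation subset
contextrag_tokens = 22_073          # ContextRAG indexing tokens on 130 tasks

def extrapolate_linear(value: float, measured: int, target: int) -> float:
    """Scale a measured cost proportionally to a larger task count."""
    return value * target / measured

hirag_tokens_projected = extrapolate_linear(
    hirag_tokens_measured, tasks_measured, tasks_target
)
# Illustrative ratio under the same linear assumption (not stated in the abstract).
token_ratio = hirag_tokens_projected / contextrag_tokens

print(f"Projected HiRAG indexing tokens: {hirag_tokens_projected:,.0f}")
print(f"Projected / ContextRAG tokens:   {token_ratio:,.0f}x")
```

The projection lands just above 23M tokens, consistent with the abstract's "over 23M". Because the HiRAG run failed during graph construction on the 20-task subset, the projection reflects cost incurred up to that failure and is best read as a rough comparison, not a measured result.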

Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation

Graph Construction

Entity Extraction

Large Language Models

Indexing Cost

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph RAG

Extraction-Free Construction

Formal Concept Analysis