HASH-RAG: Bridging Deep Hashing with Retriever for Efficient, Fine Retrieval and Augmented Generation

📅 2025-05-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the trade-off between efficiency and fine-grained retrieval in Retrieval-Augmented Generation (RAG) over large-scale knowledge bases, this paper proposes an end-to-end learnable binary hashing framework that bypasses intermediate feature extraction and directly supports proposition-level chunking and context-enhanced generation. Methodologically, it integrates deep hashing learning, end-to-end optimization of binary codes, prompt-driven context fusion, and fine-grained chunking. Key contributions include: (1) the first joint optimization paradigm wherein hash codes directly drive both retrieval and generation; and (2) the Prompt-Guided Chunk-to-Context (PGCC) module, which aligns hash indices with original textual semantics. Experiments on NQ, TriviaQA, and HotpotQA demonstrate a 90% reduction in retrieval latency and absolute improvements of 1.4–4.3% in Exact Match (EM), achieving a compelling balance between high efficiency and high recall.

📝 Abstract
Retrieval-Augmented Generation (RAG) encounters efficiency challenges when scaling to massive knowledge bases while preserving contextual relevance. We propose Hash-RAG, a framework that integrates deep hashing techniques with systematic optimizations to address these limitations. Our framework learns binary hash codes for queries directly from the knowledge base, eliminating intermediate feature extraction steps and significantly reducing storage and computational overhead. Building upon this hash-based efficient retrieval framework, we establish the foundation for fine-grained chunking. Consequently, we design a Prompt-Guided Chunk-to-Context (PGCC) module that leverages retrieved hash-indexed propositions and their original document segments through prompt engineering to enhance the LLM's contextual awareness. Experimental evaluations on the NQ, TriviaQA, and HotpotQA datasets demonstrate that our approach achieves a 90% reduction in retrieval time compared to conventional methods while maintaining considerable recall performance. Additionally, the proposed system outperforms retrieval and non-retrieval baselines by 1.4-4.3% in EM scores.
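The efficiency gain reported in the abstract comes from replacing dense-vector similarity search with comparisons over compact binary codes, where nearest neighbors are found by Hamming distance (XOR plus popcount). The paper's learned hashing model is not reproduced here; the following is only a minimal sketch of the retrieval step over already-computed codes, with all names (`hamming_search`, the toy 16-bit codes) being illustrative assumptions:

```python
import numpy as np

def hamming_search(query_code, db_codes, k=2):
    """Return indices of the k database codes nearest to the query in
    Hamming distance. Codes are uint8-packed bit arrays, one row each."""
    # XOR sets exactly the bits where query and database codes differ;
    # summing the unpacked bits counts them (a vectorized popcount).
    diff = np.bitwise_xor(db_codes, query_code)
    dists = np.unpackbits(diff, axis=1).sum(axis=1)
    return np.argsort(dists, kind="stable")[:k]

# Toy 16-bit codes standing in for a 4-proposition knowledge base.
db = np.array([[0b10101010, 0b11110000],
               [0b10101011, 0b11110000],
               [0b00000000, 0b00001111],
               [0b11111111, 0b11111111]], dtype=np.uint8)
q = np.array([0b10101010, 0b11110001], dtype=np.uint8)
print(hamming_search(q, db, k=2))  # nearest two propositions: [0 1]
```

In a Hash-RAG-style pipeline, the indices returned here would map back to hash-indexed propositions and their source document segments, which the PGCC module then assembles into the generation prompt.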
Problem

Research questions and friction points this paper is trying to address.

Improving retrieval efficiency in large knowledge bases
Reducing storage and computational overhead in RAG
Enhancing contextual awareness in LLMs via hash-indexing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates deep hashing with systematic optimizations
Learns binary hash codes directly from the knowledge base
Uses Prompt-Guided Chunk-to-Context module
Jinyu Guo
University of Electronic Science and Technology of China
Natural Language Processing
Xunlei Chen
University of Electronic Science and Technology of China
Qiyang Xia
University of Electronic Science and Technology of China
Zhaokun Wang
University of Electronic Science and Technology of China
Jie Ou
University of Electronic Science and Technology of China
Libo Qin
Central South University
Shunyu Yao
Big Data and Artificial Intelligence Institute, China Telecom Research Institute
Wenhong Tian
University of Electronic Science and Technology of China
Approximation Algorithms for NP-Hard Problems · Resource Scheduling · Network Modeling and Performance Optimization