🤖 AI Summary
This work addresses the high computational overhead of retrieval and generation in existing RAG-based approaches for large-scale entity matching. To tackle this challenge, the authors propose CE-RAG4EM, a cost-efficient architecture that introduces blocking-based batched retrieval and generation, together with a unified framework for analyzing blocking-aware optimizations and retrieval granularity. The method substantially reduces the inference cost of large language models while achieving matching accuracy on par with or superior to strong baselines. Experimental results across multiple benchmarks show that CE-RAG4EM significantly shortens end-to-end runtime and expose an inherent trade-off between matching performance and computational cost.
📝 Abstract
Retrieval-augmented generation (RAG) enhances LLM reasoning in knowledge-intensive tasks, but existing RAG pipelines incur substantial retrieval and generation overhead when applied to large-scale entity matching. To address this limitation, we introduce CE-RAG4EM, a cost-efficient RAG architecture that reduces computation through blocking-based batch retrieval and generation. We also present a unified framework for analyzing and evaluating RAG systems for entity matching, focusing on blocking-aware optimizations and retrieval granularity. Extensive experiments suggest that CE-RAG4EM can achieve comparable or improved matching quality while substantially reducing end-to-end runtime relative to strong baselines. Our analysis further reveals that key configuration parameters introduce an inherent trade-off between performance and overhead, offering practical guidance for designing efficient and scalable RAG systems for entity matching and data integration.
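To make the core idea concrete, the blocking-based batching described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the blocking key, the record schema, and the placeholder match decision (exact name equality standing in for a batched retrieval-plus-generation call) are all hypothetical.

```python
from collections import defaultdict

def blocking_key(record):
    # Hypothetical blocking key: first token of the name, lowercased.
    # Real systems would use a learned or rule-based key.
    return record["name"].split()[0].lower()

def block_records(left, right):
    # Group records from both tables by blocking key, so only
    # within-block pairs become candidate matches (the blocking step).
    blocks = defaultdict(lambda: ([], []))
    for r in left:
        blocks[blocking_key(r)][0].append(r)
    for r in right:
        blocks[blocking_key(r)][1].append(r)
    return blocks

def batched_match(blocks, batch_size=4):
    # Form candidate pairs within each block, then score them in batches.
    # Each batch stands in for one amortized retrieval + generation call,
    # instead of one LLM call per candidate pair.
    pairs = [(l, r) for ls, rs in blocks.values() for l in ls for r in rs]
    matches = []
    for i in range(0, len(pairs), batch_size):
        batch = pairs[i:i + batch_size]
        # Placeholder "generation": exact name equality as the decision.
        matches.extend((l, r) for l, r in batch if l["name"] == r["name"])
    return matches

left = [{"name": "apple inc"}, {"name": "acme corp"}]
right = [{"name": "apple inc"}, {"name": "apple computer"}]
found = batched_match(block_records(left, right))
```

Blocking prunes the quadratic pair space before any LLM call, and batching amortizes the remaining retrieval and generation cost across candidates in the same block; together these are the two levers the architecture tunes against matching quality.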