RAGe: A Retrieval-Augmented Generation Evaluation Framework

📅 2026-05-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses key challenges in deploying large language models for retrieval-augmented generation (RAG): high computational overhead, rapid knowledge obsolescence, and reliance on manual component selection. The authors propose a modular evaluation framework, RAGe, that directly links hardware constraints to RAG performance. By combining resource telemetry with an automated recommendation mechanism, the framework identifies suitable combinations of components (document chunking strategies, embedding models, vector databases, and retrievers) for domain-specific datasets. The goal is to preserve retrieval and generation quality while keeping resource consumption within what the available hardware allows. Designed to support rapid prototyping on consumer-grade hardware, the framework guides domain-tailored RAG configuration by weighing trade-offs among accuracy, efficiency, and scalability.
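As a rough illustration of what "linking hardware constraints to RAG performance" could mean in practice, the sketch below picks the highest-quality configuration among those that fit a memory and latency budget. It is not RAGe's actual recommendation mechanism, which the summary does not specify; the `Measurement` record, the `recommend` function, and every field name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Measurement:
    """One benchmarked pipeline configuration (hypothetical record layout)."""
    config: str          # label such as "chunk=256|embedder=small|retriever=dense"
    quality: float       # any quality score where higher is better
    peak_mem_mb: float   # peak memory observed during the run
    latency_s: float     # mean end-to-end latency per query


def recommend(
    measurements: Sequence[Measurement],
    max_mem_mb: float,
    max_latency_s: float,
) -> Optional[Measurement]:
    """Return the best-quality configuration that fits the hardware budget.

    Ties on quality are broken in favour of lower memory, then lower latency.
    Returns None when no configuration fits the budget.
    """
    feasible = [
        m for m in measurements
        if m.peak_mem_mb <= max_mem_mb and m.latency_s <= max_latency_s
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda m: (m.quality, -m.peak_mem_mb, -m.latency_s))
```

A hard budget filter followed by a quality ranking is only one possible design; a weighted score or a Pareto front over quality and cost would be equally reasonable readings of the summary.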

📝 Abstract

Deploying Large Language Model (LLM) applications, particularly those relying on Retrieval-Augmented Generation (RAG), remains challenging due to high computational demands, outdated knowledge bases, and the need to manually select optimal pipeline components. In this work, we propose a modular framework for benchmarking and guiding the efficient development of RAG applications. It focuses on resource telemetry and component recommendation, suggesting the best components for a domain-specific dataset. Our approach leverages core techniques in LLM applications, including document chunking, vector databases, embedding models, and retrievers, to evaluate trade-offs among accuracy, efficiency, and scalability. By directly correlating retrieval and generation quality with underlying hardware constraints, RAGe helps researchers identify the most effective domain-specific RAG setups for their operational needs, facilitating rapid prototyping even on consumer-grade hardware.
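The benchmarking idea in the abstract (try each combination of components, record quality alongside resource telemetry) can be sketched with the Python standard library alone. The example below is a toy under stated assumptions, not the RAGe implementation: word-count chunking and token-overlap scoring stand in for real chunkers, embedding models, and retrievers, and all function names are hypothetical. Telemetry is limited to wall-clock time (`time.perf_counter`) and peak Python heap usage (`tracemalloc`).

```python
import itertools
import time
import tracemalloc


def chunk_words(text, size):
    """Split text into chunks of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def overlap_score(query, chunk):
    """Toy retriever score: number of shared lowercase tokens."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def jaccard_score(query, chunk):
    """Toy retriever score: Jaccard similarity of lowercase token sets."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def run_config(corpus, qa_pairs, chunk_size, scorer):
    """Run one configuration and return quality plus resource telemetry."""
    tracemalloc.start()
    start = time.perf_counter()

    chunks = chunk_words(corpus, chunk_size)
    hits = 0
    for question, answer in qa_pairs:
        # Retrieve the single best-scoring chunk for this question.
        best = max(chunks, key=lambda ch: scorer(question, ch))
        # Count a hit when the expected answer string appears in that chunk.
        hits += answer.lower() in best.lower()

    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "chunk_size": chunk_size,
        "scorer": scorer.__name__,
        "hit_rate": hits / len(qa_pairs),
        "seconds": elapsed,
        "peak_bytes": peak_bytes,
    }


def benchmark(corpus, qa_pairs, chunk_sizes, scorers):
    """Evaluate every (chunk size, scorer) combination."""
    return [
        run_config(corpus, qa_pairs, size, scorer)
        for size, scorer in itertools.product(chunk_sizes, scorers)
    ]
```

Calling `benchmark(corpus, qa_pairs, [8, 32], [overlap_score, jaccard_score])` on a non-empty corpus and a non-empty list of question/answer pairs yields one result row per combination, which can then be ranked or filtered against a hardware budget. A real setup would also need GPU and disk telemetry, which `tracemalloc` does not cover.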

Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation

Large Language Model

RAG evaluation

component selection

resource constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation

modular evaluation framework

resource telemetry