RAGe: A Retrieval-Augmented Generation Evaluation Framework

📅 2026-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses key challenges in deploying large language models for retrieval-augmented generation (RAG): high computational overhead, rapid knowledge obsolescence, and dependence on manual component selection. The authors propose a modular evaluation framework that, for the first time, directly links hardware constraints to RAG performance. By combining resource telemetry with an automated recommendation mechanism, the framework identifies effective combinations of components (document chunking strategies, embedding models, vector databases, and retrievers) for domain-specific datasets. This approach maintains high generation quality while substantially reducing resource consumption. Designed to support rapid prototyping on consumer-grade hardware, the framework enables automatic, domain-tailored RAG configuration that balances accuracy, efficiency, and scalability.
📝 Abstract
Deploying Large Language Model (LLM) applications, particularly those relying on Retrieval-Augmented Generation (RAG), remains challenging due to high computational demands, outdated knowledge bases, and the need to manually select optimal pipeline components. In this work, we propose a modular framework for benchmarking and guiding the efficient development of RAG applications. The framework focuses on resource telemetry and component recommendation, suggesting the best components for a domain-specific dataset. Our approach leverages core techniques in LLM applications, including document chunking, vector databases, embedding models, and retrievers, to evaluate trade-offs among accuracy, efficiency, and scalability. By directly correlating retrieval and generation quality with underlying hardware constraints, RAGe helps researchers identify the most effective domain-specific RAG setups for their operational needs, facilitating rapid prototyping even on consumer-grade hardware.
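The workflow the abstract describes (sweep component combinations, record resource telemetry for each run, then recommend a configuration that fits a hardware budget) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not RAGe's actual API: the names `measure`, `sweep`, `recommend`, and `toy_run` are hypothetical, the quality scores are placeholder numbers rather than results from the paper, and `tracemalloc` only tracks Python heap allocations, so it stands in for real hardware telemetry.

```python
import itertools
import time
import tracemalloc
from dataclasses import dataclass


@dataclass
class Result:
    config: dict      # one combination of pipeline components
    quality: float    # score returned by the evaluation callable
    seconds: float    # wall-clock time of the run
    peak_bytes: int   # peak Python heap usage during the run


def measure(run, config):
    """Run one configuration and record time and peak Python heap usage."""
    tracemalloc.start()
    start = time.perf_counter()
    quality = run(config)
    seconds = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return Result(config, quality, seconds, peak)


def sweep(run, space):
    """Evaluate every combination in the component search space."""
    keys = list(space)
    combos = itertools.product(*(space[k] for k in keys))
    return [measure(run, dict(zip(keys, values))) for values in combos]


def recommend(results, max_seconds, max_peak_bytes):
    """Return the highest-quality result that fits the budget, or None."""
    feasible = [
        r for r in results
        if r.seconds <= max_seconds and r.peak_bytes <= max_peak_bytes
    ]
    return max(feasible, key=lambda r: r.quality, default=None)


def toy_run(config):
    # Hypothetical stand-in for a real RAG evaluation; scores are made up.
    chunk_score = {"fixed": 0.6, "sentence": 0.7}[config["chunking"]]
    embed_bonus = {"small": 0.0, "large": 0.1}[config["embedding"]]
    return chunk_score + embed_bonus


space = {"chunking": ["fixed", "sentence"], "embedding": ["small", "large"]}
results = sweep(toy_run, space)
best = recommend(results, max_seconds=5.0, max_peak_bytes=10**9)
```

In a real setting, `toy_run` would be replaced by a function that builds the pipeline for the given configuration and scores it on a domain-specific dataset, and the budget arguments would reflect the target machine.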
Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation
Large Language Model
RAG evaluation
component selection
resource constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation
modular evaluation framework
resource telemetry
component recommendation
LLM efficiency
Larissa Guder
School of Technology, Pontifical Catholic University of Rio Grande do Sul, Av. Bento Gonçalves, Porto Alegre, Rio Grande do Sul, Brazil
João Pedro de Moura
School of Technology, Pontifical Catholic University of Rio Grande do Sul, Av. Bento Gonçalves, Porto Alegre, Rio Grande do Sul, Brazil
Arthur Accorsi
School of Technology, Pontifical Catholic University of Rio Grande do Sul, Av. Bento Gonçalves, Porto Alegre, Rio Grande do Sul, Brazil
Gustavo Losch do Amaral
School of Technology, Pontifical Catholic University of Rio Grande do Sul, Av. Bento Gonçalves, Porto Alegre, Rio Grande do Sul, Brazil
Maurício Cecílio Magnaguagno
School of Technology, Pontifical Catholic University of Rio Grande do Sul, Av. Bento Gonçalves, Porto Alegre, Rio Grande do Sul, Brazil
Felipe Meneguzzi
Professor of Computing Science, University of Aberdeen
Goal Recognition, Automated Planning, Heuristic Search, Multiagent Systems, Artificial Intelligence
Marcio Sorraglia Pinho
School of Technology, Pontifical Catholic University of Rio Grande do Sul, Av. Bento Gonçalves, Porto Alegre, Rio Grande do Sul, Brazil
Dalvan Griebler
Associate Professor at PUCRS
Parallel Computing, AI and Data Science, Data Stream, Cloud Computing, Programming Language