Faster, Cheaper, Better: Multi-Objective Hyperparameter Optimization for LLM and RAG Systems

📅 2025-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Jointly optimizing large language models (LLMs) and retrieval-augmented generation (RAG) systems entails balancing multiple conflicting objectives (cost, latency, safety, and alignment) within a high-dimensional configuration space, under noisy, expensive evaluations. Method: the paper proposes the first end-to-end multi-objective hyperparameter optimization framework for LLM+RAG pipelines, applying Bayesian optimization to black-box multi-objective search in order to jointly select and tune embedding models, LLMs, re-rankers, and RAG component hyperparameters. It introduces two novel RAG-specific multi-objective benchmark tasks, which reveal strong configuration-task dependencies and caution against assuming that optimal configurations generalize across tasks. Results: the framework achieves a superior Pareto frontier versus baselines, reducing cost by 23%, latency by 31%, and improving safety scores by 18%, demonstrating for the first time the feasibility and effectiveness of simultaneous four-objective optimization in LLM+RAG systems.

📝 Abstract
While Retrieval Augmented Generation (RAG) has emerged as a popular technique for improving Large Language Model (LLM) systems, it introduces a large number of choices, parameters, and hyperparameters that must be made or tuned. This includes the LLM, embedding, and ranker models themselves, as well as hyperparameters governing individual RAG components. Yet, collectively optimizing the entire configuration in a RAG or LLM system remains under-explored, especially in multi-objective settings, due to intractably large solution spaces, noisy objective evaluations, and the high cost of evaluations. In this work, we introduce the first approach for multi-objective parameter optimization of cost, latency, safety and alignment over entire LLM and RAG systems. We find that Bayesian optimization methods significantly outperform baseline approaches, obtaining a superior Pareto front on two new RAG benchmark tasks. We conclude our work with important considerations for practitioners who are designing multi-objective RAG systems, highlighting nuances such as how optimal configurations may not generalize across tasks and objectives.
Problem

Research questions and friction points this paper is trying to address.

Jointly optimize hyperparameters of entire LLM and RAG systems across multiple objectives
Handle intractably large solution spaces and noisy, expensive evaluations
Balance cost, latency, safety, and alignment across configurations
Innovation

Methods, ideas, or system contributions that make the work stand out.

First end-to-end multi-objective hyperparameter optimization framework for LLM+RAG pipelines
Bayesian optimization applied to black-box multi-objective search
Two new RAG benchmark tasks evaluated via Pareto fronts
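The paper's optimizer and benchmarks are not reproduced here, but the core evaluation concept, a Pareto front over competing objectives, is easy to illustrate. The sketch below assumes each configuration has already been scored as a tuple of objectives normalized so that lower is better for all of them (e.g. cost, latency, negated safety, negated alignment); the function names are illustrative, not from the paper.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective
    and strictly better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective tuples."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Toy example with two objectives, e.g. (cost, latency):
# (2, 2) is dominated by (1, 2), so only the trade-off points remain.
print(pareto_front([(1, 2), (2, 1), (2, 2)]))  # → [(1, 2), (2, 1)]
```

A multi-objective optimizer (Bayesian or otherwise) would repeatedly propose configurations, evaluate their objective tuples, and keep the non-dominated set as the reported Pareto front; a "superior" front, as claimed in the results, is one whose points dominate more of the baseline's points.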