C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Generation

📅 2025-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Retrieval-augmented generation (RAG) systems suffer from inefficient collaboration between retrievers and large language models (LLMs), leading to suboptimal performance and poor alignment. To address this, we propose a lightweight, plug-and-play multi-agent framework that requires no modifications to existing retrievers or LLMs. Our method centers on two key innovations: (1) a proxy-centric architecture that emulates the human behavior of iteratively refining queries and reviewing results; and (2) a tree-structured rollout mechanism grounded in reinforcement learning, which enables end-to-end joint optimization of retrieval intent classification, query rewriting, and result filtering via fine-grained credit assignment. Evaluated on both in-domain and out-of-distribution RAG tasks, our approach achieves significant performance gains while preserving strong generalization and component agnosticism, i.e., compatibility with arbitrary off-the-shelf retrievers and LLMs.

📝 Abstract

Retrieval-augmented generation (RAG) systems face a fundamental challenge in aligning independently developed retrievers and large language models (LLMs). Existing approaches typically involve modifying either component or introducing simple intermediate modules, resulting in practical limitations and sub-optimal performance. Inspired by human search behavior, which typically involves a back-and-forth process of proposing search queries and reviewing documents, we propose C-3PO, a proxy-centric framework that facilitates communication between retrievers and LLMs through a lightweight multi-agent system. Our framework implements three specialized agents that collaboratively optimize the entire RAG pipeline without altering the retriever or the LLM. These agents work together to assess the need for retrieval, generate effective queries, and select information suitable for the LLM. To enable effective multi-agent coordination, we develop a tree-structured rollout approach for reward credit assignment in reinforcement learning. Extensive experiments in both in-domain and out-of-distribution scenarios demonstrate that C-3PO significantly enhances RAG performance while maintaining plug-and-play flexibility and superior generalization capabilities.
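The three-agent flow described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Proxy` class, its three callables, the `answer` function, and the toy retriever, LLM, and corpus are all hypothetical stand-ins chosen for illustration. The only point it tries to show is that the proxy sits between a retriever and an LLM that are both treated as unmodified black boxes.

```python
from dataclasses import dataclass
from typing import Callable, List

# Frozen components: the proxy only calls them, it never changes them.
Retriever = Callable[[str], List[str]]      # query -> documents
LLM = Callable[[str, List[str]], str]       # (question, context) -> answer


@dataclass
class Proxy:
    """Hypothetical bundle of the three agent roles named in the abstract."""
    needs_retrieval: Callable[[str], bool]               # assess the need for retrieval
    write_query: Callable[[str], str]                    # generate a search query
    select_docs: Callable[[str, List[str]], List[str]]   # pick documents for the LLM


def answer(question: str, proxy: Proxy, retriever: Retriever, llm: LLM) -> str:
    """Route one question through the proxy, the retriever, and the LLM."""
    if not proxy.needs_retrieval(question):
        return llm(question, [])
    query = proxy.write_query(question)
    candidates = retriever(query)
    context = proxy.select_docs(question, candidates)
    return llm(question, context)


# --- Toy stand-ins (assumptions for the demo, not real models) ---
CORPUS = [
    "Paris is the capital of France.",
    "France borders Spain.",
    "Bananas are rich in potassium.",
]
STOP_WORDS = {"what", "is", "the", "of"}


def toy_retriever(query: str) -> List[str]:
    """Return every corpus document sharing at least one word with the query."""
    terms = set(query.lower().split())
    return [d for d in CORPUS if terms & set(d.lower().rstrip(".").split())]


def toy_llm(question: str, context: List[str]) -> str:
    """Pretend LLM that only reports how much context it received."""
    return f"used {len(context)} document(s)"


toy_proxy = Proxy(
    needs_retrieval=lambda q: "capital" in q.lower(),
    write_query=lambda q: " ".join(
        w for w in q.lower().strip("?").split() if w not in STOP_WORDS
    ),
    select_docs=lambda q, docs: [d for d in docs if "capital" in d.lower()],
)

print(answer("What is the capital of France?", toy_proxy, toy_retriever, toy_llm))
print(answer("What is 2 + 2?", toy_proxy, toy_retriever, toy_llm))
```

In this toy run the rewritten query `capital france` retrieves two documents, the selection step keeps one of them, and the second question skips retrieval entirely. In a real system each of the three callables would be a learned agent rather than a hand-written rule.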

Problem

Research questions and friction points this paper is trying to address.

Aligning independently developed retrievers and LLMs

Improving RAG performance without modifying the retriever or the LLM

Preserving plug-and-play flexibility and generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight multi-agent system

Tree-structured rollout approach

Plug-and-play flexibility
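To make the idea of credit assignment over a tree-structured rollout more concrete, here is a small Python sketch. It assumes one simple estimator: each intermediate decision is scored by the mean terminal reward of the leaves beneath it. The summary above does not state the exact estimator C-3PO uses, so this is an illustrative assumption, and the `Node` class, the `backup` function, and the action labels are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One decision in a rollout tree; only leaves carry a terminal reward."""
    action: str
    reward: Optional[float] = None
    children: List["Node"] = field(default_factory=list)
    value: float = 0.0


def backup(node: Node) -> Tuple[float, int]:
    """Set node.value to the mean leaf reward below it.

    Returns (sum of leaf rewards, number of leaves) so that parents can
    average over all descendant leaves rather than over direct children.
    """
    if not node.children:
        if node.reward is None:
            raise ValueError(f"leaf {node.action!r} has no reward")
        node.value = node.reward
        return node.reward, 1
    total, count = 0.0, 0
    for child in node.children:
        child_total, child_count = backup(child)
        total += child_total
        count += child_count
    node.value = total / count
    return total, count


# Toy tree: one retrieval decision, two sampled queries, two sampled
# document selections per query. Rewards are made up for the example
# (1.0 = the final answer was correct, 0.0 = it was not).
query_a = Node("query A", children=[
    Node("select docs 1", reward=1.0),
    Node("select docs 2", reward=0.0),
])
query_b = Node("query B", children=[
    Node("select docs 3", reward=1.0),
    Node("select docs 4", reward=1.0),
])
root = Node("retrieve", children=[query_a, query_b])

backup(root)
for n in (root, query_a, query_b):
    print(n.action, n.value)
```

Under this assumption the query whose branches lead to correct answers more often receives a higher value, so each intermediate decision gets its own training signal instead of sharing a single end-of-episode reward.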