RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory

📅 2025-08-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing multi-agent LLM systems rely on static or full-context routing, leading to excessive token consumption, redundant memory usage, and poor adaptability across multi-turn interactions. To address these limitations, we propose RCR-Router—the first dynamic context routing framework jointly conditioned on agent roles and task progression stages. It incorporates structured shared memory, lightweight semantic relevance scoring, iterative context integration, and dynamic subset selection under explicit token budget constraints. Crucially, we introduce Answer Quality Score (AQ-Score), an output-aware metric enabling fine-grained evaluation of generated responses. Extensive experiments on HotPotQA, MuSiQue, and 2WikiMultihop demonstrate that RCR-Router reduces token consumption by up to 30% while maintaining or improving answer accuracy. These results validate the method’s effectiveness, robustness, and scalability in complex multi-hop reasoning tasks.

📝 Abstract
Multi-agent large language model (LLM) systems have shown strong potential in complex reasoning and collaborative decision-making tasks. However, most existing coordination schemes rely on static or full-context routing strategies, which lead to excessive token consumption, redundant memory exposure, and limited adaptability across interaction rounds. We introduce RCR-Router, a modular and role-aware context routing framework designed to enable efficient, adaptive collaboration in multi-agent LLMs. To our knowledge, this is the first routing approach that dynamically selects semantically relevant memory subsets for each agent based on its role and task stage, while adhering to a strict token budget. A lightweight scoring policy guides memory selection, and agent outputs are iteratively integrated into a shared memory store to facilitate progressive context refinement. To better evaluate model behavior, we further propose an Answer Quality Score metric that captures LLM-generated explanations beyond standard QA accuracy. Experiments on three multi-hop QA benchmarks -- HotPotQA, MuSiQue, and 2WikiMultihop -- demonstrate that RCR-Router reduces token usage (up to 30%) while improving or maintaining answer quality. These results highlight the importance of structured memory routing and output-aware evaluation in advancing scalable multi-agent LLM systems.
Problem

Research questions and friction points this paper is trying to address.

Dynamic role-based memory routing for multi-agent LLMs
Reducing token usage while maintaining answer quality
Structured memory and output-aware evaluation in multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic role-aware memory selection for agents
Lightweight scoring policy guides memory routing
Iterative output integration for context refinement
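The core routing idea described above (score each shared-memory item for an agent's role, then greedily select a subset under a token budget) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `score` heuristic (keyword overlap), the function names, and the word-count token estimate are all assumptions made for clarity.

```python
def score(item: str, role_keywords: set[str]) -> float:
    """Toy semantic-relevance score: keyword overlap with the agent's role.

    The paper uses a lightweight scoring policy; keyword overlap here is a
    stand-in assumption, not the actual policy.
    """
    words = set(item.lower().split())
    return len(words & role_keywords) / (len(words) or 1)


def route(memory: list[str], role_keywords: set[str], token_budget: int) -> list[str]:
    """Greedily pick the highest-scoring memory items within the token budget."""
    ranked = sorted(memory, key=lambda m: score(m, role_keywords), reverse=True)
    selected, used = [], 0
    for item in ranked:
        cost = len(item.split())  # crude token estimate (real systems use a tokenizer)
        if used + cost <= token_budget:
            selected.append(item)
            used += cost
    return selected
```

In the full framework, each agent's output would then be written back into the shared memory store, so later routing rounds score a progressively refined context.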
Jun Liu
Carnegie Mellon University
Zhenglun Kong
Harvard University
Efficient Deep Learning, Large Language Model, AI4Science
Changdi Yang
PhD candidate, Northeastern University, Snap Inc.
Efficient Deep Learning
Fan Yang
Fujitsu Research of America
Tianqi Li
Carnegie Mellon University
Peiyan Dong
MIT
Joannah Nanjekye
Carnegie Mellon University
Hao Tang
Peking University
Geng Yuan
University of Georgia
Efficient AI, Explainable AI, Trustworthy ML, Edge Computing, AI Applications
Wei Niu
University of Georgia
Wenbin Zhang
Florida International University
Pu Zhao
Northeastern University
Xue Lin
Northeastern University
Electrical and Computer Engineering
Dong Huang
Carnegie Mellon University
Yanzhi Wang
Northeastern University