AI Summary
RAG systems suffer from computational inefficiency due to uniform query routing: identical processing pipelines are applied to both simple and complex queries. This work proposes SymRAG, the first neural-symbolic collaborative adaptive query routing framework. It employs a lightweight neural evaluator to dynamically assess query complexity and system load in real time, thereby selecting among symbolic reasoning, neural generation, or hybrid processing paths; it further incorporates a multi-hop retrieval adaptation module. Evaluated on Llama-3.2-3B and Mistral-7B, SymRAG achieves 97.6% to 100.0% accuracy on HotpotQA and DROP, reduces CPU utilization to between 3.6% and 6.2%, and delivers response times of 0.985 to 3.165 seconds. Disabling adaptivity increases latency by 169% to 1151%. The core contribution is the first fine-grained, load-aware dynamic path selection mechanism for RAG, substantially improving energy efficiency and computational throughput.
Abstract
Retrieval-Augmented Generation (RAG) systems address factual inconsistencies in Large Language Models by grounding generation in external knowledge, yet they face a fundamental efficiency problem: simple queries consume computational resources equivalent to complex multi-hop reasoning tasks. We present SymRAG, a neuro-symbolic framework that introduces adaptive query routing based on real-time complexity and system load assessments. SymRAG dynamically selects symbolic, neural, or hybrid processing paths to align resource use with query demands. Evaluated on 2,000 queries from HotpotQA and DROP using Llama-3.2-3B and Mistral-7B models, SymRAG achieves 97.6--100.0% exact match accuracy with significantly lower CPU utilization (3.6--6.2%) and processing time (0.985--3.165s). Disabling adaptive logic results in a 169--1151% increase in processing time, highlighting the framework's impact. These results underscore the potential of adaptive neuro-symbolic routing for scalable, sustainable AI systems.
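To make the routing idea concrete, the following is a minimal sketch of load-aware path selection, not SymRAG's actual implementation. It assumes that a query complexity score and a system load score are already available, both normalized to [0, 1]. The names `Path`, `RouterConfig`, and `route`, the threshold values, and the specific decision rule (low complexity goes symbolic, high complexity goes neural, the middle band goes hybrid unless load is high) are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    SYMBOLIC = "symbolic"
    NEURAL = "neural"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class RouterConfig:
    # Hypothetical thresholds chosen for illustration; not taken from the paper.
    low_complexity: float = 0.3
    high_complexity: float = 0.7
    high_load: float = 0.8


def route(complexity: float, system_load: float,
          cfg: RouterConfig = RouterConfig()) -> Path:
    """Pick a processing path from a complexity score and a load score.

    Both scores are assumed to lie in [0, 1]. In a real system they would
    come from a learned evaluator and a resource monitor, respectively.
    """
    for name, value in (("complexity", complexity), ("system_load", system_load)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    # Simple queries take the cheap symbolic path regardless of load.
    if complexity < cfg.low_complexity:
        return Path.SYMBOLIC
    # Clearly complex queries go to neural generation.
    if complexity >= cfg.high_complexity:
        return Path.NEURAL
    # Middle band: run the hybrid path only when there is headroom;
    # under high load, fall back to the cheaper symbolic path (an assumption).
    if system_load >= cfg.high_load:
        return Path.SYMBOLIC
    return Path.HYBRID


if __name__ == "__main__":
    for c, load in [(0.1, 0.5), (0.5, 0.2), (0.5, 0.95), (0.9, 0.2)]:
        print(f"complexity={c:.2f} load={load:.2f} -> {route(c, load).value}")
```

The point of the sketch is only that the routing decision depends on two signals rather than one: the same mid-complexity query can take different paths depending on current load, which is the behavior that a uniform pipeline lacks.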