DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router

📅 2025-07-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Large language models (LLMs) struggle to dynamically access up-to-date or domain-specific knowledge for knowledge-intensive queries.

Method: This paper proposes an agentic retrieval-augmented generation (RAG) framework that recursively decomposes complex queries into structured sub-questions and employs an LLM as a "knowledge router" to perform fine-grained matching on both sides: aligning each sub-question's intent with the most suitable of several heterogeneous knowledge sources, and routing dynamically at each hop. The framework further incorporates multi-stage information distillation and modular reasoning-path construction to improve retrieval precision, reasoning depth, and interpretability.

Contribution/Results: Experiments show that the proposed framework significantly outperforms conventional RAG approaches on multi-hop question answering benchmarks, adapts well across diverse domains and query types, and keeps its reasoning process transparent and traceable.

📝 Abstract
Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries due to their inability to dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses in external sources. However, existing RAG methods lack fine-grained control over both the query and source sides, often resulting in noisy retrieval and shallow reasoning. In this work, we introduce DeepSieve, an agentic RAG framework that incorporates information sieving via LLM-as-a-knowledge-router. DeepSieve decomposes complex queries into structured sub-questions and recursively routes each to the most suitable knowledge source, filtering irrelevant information through a multi-stage distillation process. Our design emphasizes modularity, transparency, and adaptability, leveraging recent advances in agentic system design. Experiments on multi-hop QA tasks across heterogeneous sources demonstrate improved reasoning depth, retrieval precision, and interpretability over conventional RAG approaches.
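The decompose-then-route loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the rule-based `decompose` and keyword-overlap `route` are stand-ins for what would be LLM calls in the actual system.

```python
from dataclasses import dataclass

@dataclass
class SubQuestion:
    text: str
    source: str = ""   # knowledge source chosen by the router

def decompose(query: str) -> list[SubQuestion]:
    """Stand-in for LLM decomposition: split a multi-hop query
    into ordered sub-questions (here, naively on ' and ')."""
    return [SubQuestion(part.strip()) for part in query.split(" and ")]

def route(sub: SubQuestion, sources: dict[str, set[str]]) -> str:
    """Stand-in for the LLM-as-a-knowledge-router: match the
    sub-question's intent against source descriptions, here
    approximated by keyword overlap."""
    words = set(sub.text.lower().split())
    return max(sources, key=lambda name: len(words & sources[name]))

def answer_query(query: str, sources: dict[str, set[str]]) -> list[SubQuestion]:
    subs = decompose(query)
    for sub in subs:
        sub.source = route(sub, sources)  # dynamic per-hop routing
        # retrieval from sub.source and answer generation would happen here
    return subs

# Example: two hops routed to two different (hypothetical) sources.
sources = {
    "wiki":   {"capital", "country", "president"},
    "pubmed": {"drug", "dosage", "trial"},
}
subs = answer_query("what is the capital of France and what drug treats malaria", sources)
print([s.source for s in subs])  # each hop lands on its best-matching source
```

The key design point the sketch preserves is that routing happens per sub-question rather than once per query, which is what allows heterogeneous sources to serve different hops of the same question.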
Problem

Research questions and friction points this paper is trying to address.

LLMs struggle with dynamic knowledge access for complex queries
Existing RAG lacks fine-grained query and source control
Noisy retrieval and shallow reasoning limit RAG effectiveness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Recursive query decomposition with an LLM-as-a-knowledge-router for source selection
Multi-stage distillation filters irrelevant information
Modular agentic RAG framework enhances precision
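The multi-stage distillation idea above can be illustrated with a small sketch: each stage scores retrieved passages against the sub-question and keeps only those above a progressively stricter threshold. The scorer and thresholds here are hypothetical stand-ins for the paper's LLM-based relevance filtering.

```python
def relevance(question: str, passage: str) -> float:
    """Toy relevance score: fraction of question words found in the
    passage (a stand-in for an LLM relevance judgment)."""
    q = set(question.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def sieve(question: str, passages: list[str],
          thresholds: tuple[float, ...] = (0.2, 0.4)) -> list[str]:
    """Multi-stage sieving: successive filtering passes with
    increasing strictness, discarding irrelevant passages early."""
    kept = passages
    for t in thresholds:
        kept = [p for p in kept if relevance(question, p) >= t]
    return kept

passages = [
    "Paris is the capital of France",
    "Bananas are rich in potassium",
    "France borders Spain and Italy",
]
print(sieve("what is the capital of France", passages))
# → ['Paris is the capital of France']
```

Filtering in stages rather than once lets a cheap early pass discard obvious noise before a stricter (in the real system, more expensive) pass runs, which is the precision/efficiency trade-off the "sieving" metaphor points at.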