🤖 AI Summary
The explosive growth of scientific literature necessitates scalable, verifiable knowledge synthesis systems; however, existing retrieval-augmented generation (RAG) approaches ignore citation graph structure, struggle with complex queries, and produce fragmented, non-attributable outputs. To address these limitations, we propose SciRAG, the first open-source knowledge synthesis framework tailored for scientific literature. SciRAG integrates adaptive retrieval, citation-aware symbolic reasoning, and outline-guided generation: it dynamically retrieves evidence, filters and hierarchically organizes documents using citation graphs, and generates structured, fully traceable answers. The framework unifies RAG, graph analytics, sequence modeling, and iterative planning, supporting both parallel and sequential evidence collection. Evaluated on benchmarks including QASA and ScholarQA, it achieves significant improvements in factual accuracy and synthesis coherence. This work establishes a trustworthy, scalable paradigm for large-scale scientific knowledge aggregation.
📝 Abstract
The accelerating growth of scientific publications has intensified the need for scalable, trustworthy systems to synthesize knowledge across diverse literature. While recent retrieval-augmented generation (RAG) methods have improved access to scientific information, they often overlook citation graph structure, adapt poorly to complex queries, and yield fragmented, hard-to-verify syntheses. We introduce SciRAG, an open-source framework for scientific literature exploration that addresses these gaps through three key innovations: (1) adaptive retrieval that flexibly alternates between sequential and parallel evidence gathering; (2) citation-aware symbolic reasoning that leverages citation graphs to organize and filter supporting documents; and (3) outline-guided synthesis that plans, critiques, and refines answers to ensure coherence and transparent attribution. Extensive experiments across multiple benchmarks such as QASA and ScholarQA demonstrate that SciRAG outperforms prior systems in factual accuracy and synthesis quality, establishing a new foundation for reliable, large-scale scientific knowledge aggregation.
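The three-stage pipeline described above can be sketched in miniature. This is a hypothetical illustration only: the toy corpus, the keyword-overlap retriever, and the function names (`retrieve`, `citation_filter`, `synthesize`) are all assumptions for exposition, not SciRAG's actual components or API.

```python
# Hypothetical sketch of a SciRAG-style pipeline:
# retrieval -> citation-graph filtering -> outline-guided, attributed synthesis.
# Corpus, citation edges, and ranking are toy stand-ins for illustration.

# Toy corpus: doc id -> text, plus a citation graph (doc -> docs it cites).
DOCS = {
    "d1": "RAG improves factual grounding for QA.",
    "d2": "Citation graphs reveal topical structure.",
    "d3": "Outline-guided generation improves coherence.",
    "d4": "Unrelated note on compiler design.",
}
CITES = {"d1": ["d2"], "d2": [], "d3": ["d1"], "d4": []}

def retrieve(query, k=3):
    """Retrieval stand-in: rank documents by keyword overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(q & set(DOCS[d].lower().split())))
    return ranked[:k]

def citation_filter(hits):
    """Citation-aware expansion: keep the hits plus the documents they cite,
    so supporting evidence follows the citation graph rather than text alone."""
    keep = list(hits)
    for d in hits:
        for cited in CITES.get(d, []):
            if cited not in keep:
                keep.append(cited)
    return keep

def synthesize(query, docs):
    """Outline-guided synthesis stand-in: one outline bullet per document,
    each prefixed with its doc id so every claim stays attributable."""
    outline = [f"[{d}] {DOCS[d]}" for d in sorted(docs)]
    return f"Q: {query}\n" + "\n".join(outline)

def scirag_answer(query):
    """End-to-end: retrieve, expand via citations, then synthesize."""
    hits = retrieve(query, k=2)
    evidence = citation_filter(hits)
    return synthesize(query, evidence)
```

In this sketch the citation step pulls in `d2` whenever `d1` is retrieved, even if `d2` shares no keywords with the query, which is the intuition behind citation-aware evidence organization; a real system would also score, critique, and iteratively refine the outline.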