🤖 AI Summary
This study addresses the knowledge fragmentation and weak interdisciplinary integration that the explosion of academic output has made worse. Existing retrieval systems rely on keyword matching or vector-space semantic retrieval and lack topological reasoning capabilities, while agentic deep-research frameworks are prone to logical hallucinations and high inference costs. To overcome these limitations, the authors construct SciAtlas, a large-scale heterogeneous academic knowledge graph spanning 26 disciplines and encompassing over 43 million papers, 157 million entities, and 3 billion triples. They further propose a neuro-symbolic hybrid retrieval algorithm that combines tri-path collaborative recall with graph-based re-ranking, moving retrieval from semantic matching toward deterministic relational discovery. The resulting panoramic scientific evolution network is intended as a cognitive foundation for AI agents, significantly reducing reasoning overhead while supporting literature review, research trend synthesis, idea positioning, and academic trajectory exploration. Interfaces for knowledge-graph retrieval and the downstream tasks are released in the authors' GitHub repository.
📝 Abstract
The exponential growth of global academic output has confronted researchers and AI agents with an unprecedented "information explosion," where fragmented and unstructured knowledge organization impedes deep interdisciplinary integration. Current academic retrieval tools predominantly rely on superficial keyword matching or vector-space semantic retrieval, which lack the topological reasoning capabilities required to navigate complex logical connections. Agentic deep-research frameworks, in turn, are often prone to logical hallucinations and incur high inference costs. To bridge this gap, in this report we introduce SciAtlas, a large-scale, multi-disciplinary, heterogeneous academic knowledge graph designed as a panoramic scientific evolution network. By integrating over 43M papers from 26 disciplines, with a total of 157M entities and 3B triples, SciAtlas provides a structured topological cognitive substrate that dismantles disciplinary barriers and furnishes AI agents with a global perspective. Furthermore, we develop a neuro-symbolic retrieval algorithm featuring tri-path collaborative recall and graph reranking, which moves retrieval from simple semantic matching to deterministic association discovery. We also present key application directions of SciAtlas, including literature review, automated research trend synthesis, idea positioning, and academic trajectory exploration, to demonstrate that SciAtlas can serve as an effective "cognitive map" that empowers the full loop of automated scientific research while significantly reducing reasoning costs. We have released the interfaces for KG retrieval and various downstream tasks in our GitHub repository.
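As a rough illustration of what multi-path recall followed by graph reranking can look like, here is a minimal sketch. The abstract does not specify the three recall paths, the fusion rule, or the reranking signal, so everything below is an assumption made for illustration: the paths are taken to be keyword, embedding, and graph-neighbor recall, the fusion uses reciprocal rank fusion (a generic technique swapped in here), and the rerank boosts candidates that are linked to other candidates in a toy triple store. None of the names correspond to the released SciAtlas interfaces.

```python
from collections import defaultdict

# Hypothetical toy triples (head, relation, tail); not SciAtlas data.
EDGES = [
    ("p1", "cites", "p2"),
    ("p2", "cites", "p3"),
    ("p1", "uses_method", "m1"),
    ("p3", "uses_method", "m1"),
    ("p4", "cites", "p5"),
]


def build_adjacency(edges):
    """Build an undirected neighbor map from (head, relation, tail) triples."""
    adj = defaultdict(set)
    for head, _relation, tail in edges:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists: each list adds 1 / (k + rank) per item."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return dict(scores)


def graph_rerank(fused, adj, weight=0.01):
    """Boost each candidate by how many other candidates it is linked to."""
    candidates = set(fused)
    boosted = {
        doc: score + weight * len(adj.get(doc, set()) & candidates)
        for doc, score in fused.items()
    }
    # Sort by descending score, breaking ties by identifier for determinism.
    return sorted(boosted, key=lambda d: (-boosted[d], d))


def retrieve(keyword_hits, vector_hits, graph_hits, edges):
    """Assumed three-path recall, fused and then reranked on the graph."""
    adj = build_adjacency(edges)
    fused = reciprocal_rank_fusion([keyword_hits, vector_hits, graph_hits])
    return graph_rerank(fused, adj)


if __name__ == "__main__":
    print(retrieve(["p1", "p4"], ["p2", "p4"], ["p3"], EDGES))
```

In this toy run, `p4` has the highest fused score because two paths return it, but `p2` ends up first after reranking because it is linked to two other candidates. The relative weight of the graph signal (`weight`) is arbitrary here and would need tuning in any real system.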