🤖 AI Summary
Existing academic data systems struggle to support diverse query types (retrieval, knowledge discovery, and generation) in a uniform way, and they lack interpretable execution mechanisms. This work proposes an intelligent data management system for academic corpora that automatically compiles natural language queries into interpretable directed acyclic graph (DAG) execution plans. The system integrates structure-aware knowledge representation, hybrid query planning driven by a large language model (LLM), and a unified execution framework based on composable operators. By combining structured knowledge management, agent-based planning, and explainable execution, the approach supports the full spectrum of academic queries. In the reported experiments it significantly outperforms existing systems in effectiveness, efficiency, and interpretability, establishing a practical foundation for agent-driven academic data management.
📝 Abstract
Managing the rapidly growing scholarly corpus poses significant challenges in representation, reasoning, and efficient analysis. An ideal system should unify structured knowledge management, agentic planning, and interpretable execution to support diverse scholarly queries, from retrieval to knowledge discovery and generation, at scale. Existing RAG and document analytics systems, however, do not support all of these query types within a single system. To this end, we propose AgenticScholar, an agentic scholarly data management system that integrates a structure-aware knowledge representation layer, an LLM-centric hybrid query planning layer, and a unified execution layer with composable operators. AgenticScholar autonomously translates natural language queries into executable DAG plans, enabling end-to-end reasoning over multi-modal scholarly data. Extensive experiments demonstrate that AgenticScholar significantly outperforms existing systems in effectiveness, efficiency, and interpretability, offering a practical foundation for future research on agentic scholarly data management.
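To make the idea of an executable DAG plan built from composable operators more concrete, here is a minimal Python sketch. It is not AgenticScholar's implementation or API: the `Node` class, the `execute_plan` function, the operator names, and the toy corpus are all hypothetical, and the plan is written by hand, whereas the abstract describes plans being produced by the LLM-centric planning layer. The sketch only shows how operators wired into a DAG can be run in dependency order while recording a trace that a reader can inspect.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Node:
    """One plan step (hypothetical): an operator plus the nodes that feed it."""
    name: str
    op: Callable[..., Any]
    inputs: Tuple[str, ...] = ()


def execute_plan(nodes: List[Node]) -> Tuple[Dict[str, Any], List[str]]:
    """Run every node in dependency order; return results and the execution trace."""
    by_name = {n.name: n for n in nodes}
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {n.name: set(n.inputs) for n in nodes}
    results: Dict[str, Any] = {}
    trace: List[str] = []
    for name in TopologicalSorter(graph).static_order():
        node = by_name[name]
        # Each operator receives the outputs of its inputs, in declared order.
        results[name] = node.op(*(results[i] for i in node.inputs))
        trace.append(name)
    return results, trace


# Toy corpus standing in for scholarly metadata (invented for illustration).
PAPERS = [
    {"title": "Paper A", "year": 2021, "topic": "RAG"},
    {"title": "Paper B", "year": 2023, "topic": "RAG"},
    {"title": "Paper C", "year": 2024, "topic": "databases"},
]

# Hand-written plan for "How many RAG papers are recent?" (assumed query).
plan = [
    Node("retrieve", lambda: [p for p in PAPERS if p["topic"] == "RAG"]),
    Node("recent", lambda ps: [p for p in ps if p["year"] >= 2022], ("retrieve",)),
    Node("count", lambda ps: len(ps), ("retrieve",)),
    Node(
        "report",
        lambda recent, total: f"{len(recent)} of {total} RAG papers are from 2022 or later",
        ("recent", "count"),
    ),
]

results, trace = execute_plan(plan)
print(results["report"])
print(trace)
```

Because every intermediate result is stored under its node name, the trace and the `results` dictionary together show which operator produced which value, which is one simple way to read "interpretable execution". Two branches (`recent` and `count`) share the output of `retrieve`, which is what makes the plan a DAG and not a linear pipeline.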