🤖 AI Summary
Existing academic knowledge graphs typically model paper entities or domain concepts in isolation, neglecting the deep semantic associations among papers driven by shared concepts; this leads to incomplete knowledge coverage and weak concept-paper alignment in scientific question answering. To address this, we propose a novel deep knowledge graph construction method tailored to NLP-oriented scientific QA. Our approach is the first to explicitly model cross-paper semantic associations grounded in shared domain concepts. We design a few-shot, large language model-driven framework for knowledge extraction and multi-granularity semantic parsing, and introduce citation-aware relational enhancement alongside subgraph community summarization. Built from the ACL Anthology (60K+ papers), the resulting graph comprises 620K entities and 2.27M relations. Experiments on three scientific QA benchmarks demonstrate significant improvements in both answer accuracy and interpretability.
📝 Abstract
Large language models (LLMs) have been widely applied to question answering over scientific research papers. To improve the professionalism and accuracy of responses, many studies employ external knowledge augmentation. However, existing external knowledge structures for scientific literature often focus solely on either paper entities or domain concepts, neglecting the intrinsic connections between papers through shared domain concepts. This results in less comprehensive and specific answers to questions that combine papers and concepts. To address this, we propose a novel knowledge graph framework that captures deep conceptual relations between academic papers, constructing a relational network via intra-paper semantic elements and inter-paper citation relations. Using a few-shot, LLM-based knowledge graph construction method, we build NLP-AKG, an academic knowledge graph for the NLP domain, by extracting 620,353 entities and 2,271,584 relations from 60,826 papers in the ACL Anthology. Based on this graph, we propose a 'sub-graph community summary' method and validate its effectiveness on three NLP scientific literature question answering datasets.
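To make the few-shot extraction step concrete, the sketch below shows one plausible shape of an LLM-driven triple-extraction loop: a few-shot prompt template plus a parser that turns the model's textual output into `(entity, relation, entity)` triples. The prompt wording, the output format, and the `parse_triples` helper are illustrative assumptions, not the authors' actual pipeline; the real system also performs multi-granularity parsing and citation-aware enhancement not shown here.

```python
# Hedged sketch of few-shot LLM-based triple extraction (assumed format,
# not the NLP-AKG authors' actual prompt or code).

# Hypothetical few-shot prompt: the example triples guide the LLM's output format.
FEW_SHOT_PROMPT = """Extract (entity, relation, entity) triples from the paper text.

Example:
Text: BERT improves GLUE scores via masked language modeling.
Triples:
(BERT, uses, masked language modeling)
(BERT, evaluated_on, GLUE)

Text: {paper_text}
Triples:
"""

def parse_triples(llm_output: str) -> list[tuple[str, ...]]:
    """Parse lines shaped like '(head, relation, tail)' from an LLM response."""
    triples = []
    for line in llm_output.splitlines():
        line = line.strip()
        if line.startswith("(") and line.endswith(")"):
            parts = [p.strip() for p in line[1:-1].split(",")]
            if len(parts) == 3:  # keep only well-formed three-part triples
                triples.append(tuple(parts))
    return triples

# Usage with a mocked LLM response (no API call is made here):
mock_response = "(NLP-AKG, built_from, ACL Anthology)\n(BERT, uses, self-attention)"
print(parse_triples(mock_response))
```

In a full pipeline, `FEW_SHOT_PROMPT.format(paper_text=...)` would be sent to an LLM and the parsed triples merged into the graph, with entity deduplication deciding when two papers share a domain concept.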