🤖 AI Summary
Knowledge Graph Completion (KGC) faces two key bottlenecks: the high computational overhead of structure-based methods and the neglect of relational context in text-based approaches. To address both, we propose KGC-ERC, a generative framework that jointly models the context of entities and of relations. Specifically, KGC-ERC textualizes input triples and introduces a sampling strategy that selects the most relevant entity and relation context while respecting the model's input token limit. The enriched input is then passed to pre-trained language models such as T5 and BERT for reasoning. Experiments on Wikidata5M, Wiki27K, and FB15K-237-N show that KGC-ERC outperforms or matches state-of-the-art baselines in both predictive performance and scalability.
📝 Abstract
Knowledge Graph Completion (KGC) aims to infer missing information in Knowledge Graphs (KGs) to address their inherent incompleteness. Traditional structure-based KGC methods, while effective, face significant computational demands and scalability challenges due to the need for dense embedding learning and for scoring all entities in the KG at each prediction. Recent text-based approaches using language models like T5 and BERT have mitigated these issues by converting KG triples into text for reasoning. However, they often fail to fully utilize contextual information, focusing mainly on the neighborhood of the entity and neglecting the context of the relation. To address this issue, we propose KGC-ERC, a framework that integrates both types of context to enrich the input of generative language models and enhance their reasoning capabilities. Additionally, we introduce a sampling strategy that selects the most relevant context within input token constraints, making better use of contextual information and potentially improving model performance. Experiments on the Wikidata5M, Wiki27K, and FB15K-237-N datasets show that KGC-ERC outperforms or matches state-of-the-art baselines in predictive performance and scalability.
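The two mechanisms the abstract describes, textualizing a triple for a generative model and greedily sampling context under a token budget, can be sketched as follows. This is a hypothetical illustration under assumed interfaces (the function names, the `" | "` triple format, the mask token, and the salience scores are not from the paper), not the authors' implementation.

```python
def textualize(head, relation, tail="<extra_id_0>"):
    """Render a (head, relation, ?) triple as a text query.

    The default tail is a T5-style mask token standing in for the
    missing entity the model should generate.
    """
    return f"{head} | {relation} | {tail}"


def sample_context(snippets, scores, budget, length=lambda s: len(s.split())):
    """Greedily pick the highest-scoring context snippets that fit a token budget.

    `snippets` are textualized neighborhood facts of the entity or relation,
    `scores` are their (assumed, externally computed) salience scores, and
    `budget` is the remaining input-token allowance. `length` approximates
    token cost; here it simply counts whitespace-separated words.
    """
    chosen, used = [], 0
    # Visit candidates from most to least salient, skipping any that overflow.
    for snippet, _score in sorted(zip(snippets, scores), key=lambda p: -p[1]):
        cost = length(snippet)
        if used + cost <= budget:
            chosen.append(snippet)
            used += cost
    return chosen
```

A query would then be assembled as the textualized triple followed by the sampled snippets, and handed to the seq2seq model to decode the missing entity.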