🤖 AI Summary
Large language models (LLMs) exhibit limitations in complex reasoning and contextual understanding. This paper proposes a knowledge graph (KG)-enhanced fine-tuning approach for T5, wherein structured KG embeddings are injected into the T5 encoder to strengthen modeling of entity relations and deep semantics. We conduct the first systematic empirical study establishing a significant positive correlation between KG scale and T5's reasoning performance, confirming that entity and relation embeddings constitute the primary source of improvement, and demonstrating that full KGs consistently outperform pruned subgraphs. On SQuAD 1.1, our method substantially surpasses standard baselines; ablation studies further reveal pronounced accuracy gains, especially on multi-hop and other complex questions. Our core contributions are threefold: (1) empirically establishing the KG-scale-performance relationship; (2) validating the necessity of KG completeness for optimal performance; and (3) introducing a scalable, structured knowledge injection paradigm for transformer encoders.
📝 Abstract
With the development of deep learning, large language models have achieved remarkable results on many natural language processing tasks. However, these models still face limitations in handling complex reasoning tasks and in drawing on rich background knowledge. To address this problem, this study proposes a knowledge-graph-based fine-tuning method for the T5 model, which enhances the model's reasoning ability and contextual understanding by introducing external knowledge graphs. Experiments on the SQuAD 1.1 dataset show that the knowledge-graph-enhanced T5 model significantly outperforms baseline models in reasoning accuracy, contextual understanding, and the handling of complex questions. We also examine the impact of knowledge graph scale on model performance and find that performance improves steadily as the knowledge graph grows; the gains are especially pronounced on complex questions, where the introduction of knowledge graphs greatly improves the T5 model's reasoning ability. Ablation experiments further verify the importance of entity and relation embeddings and show that a complete knowledge graph is crucial to improving the model's capabilities across the board. In summary, this study provides an effective method for enhancing the reasoning and understanding abilities of large language models and suggests new directions for future research.
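The injection step described above can be sketched minimally. This is an illustrative assumption of one common fusion scheme (project pretrained KG entity embeddings into the encoder's hidden space and add them at entity token positions); the abstract does not specify the paper's exact mechanism, and all names, dimensions, and entities below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions, not taken from the paper.
d_model = 8   # T5 encoder hidden size
d_kg = 4      # KG embedding size (e.g., from a TransE-style model)
seq_len = 5

# Stand-in pretrained KG entity embeddings (random here).
kg_entity_emb = {
    "Paris": rng.normal(size=d_kg),
    "France": rng.normal(size=d_kg),
}

# Learned projection mapping the KG space into the encoder's hidden space.
W_proj = rng.normal(size=(d_kg, d_model))

def inject_kg(hidden_states, entity_spans):
    """Add projected KG embeddings to the hidden states of entity tokens.

    hidden_states: (seq_len, d_model) array of encoder hidden states
    entity_spans:  list of (token_index, entity_name) pairs linking
                   tokens in the input to KG entities
    """
    out = hidden_states.copy()
    for idx, name in entity_spans:
        # Project the KG embedding and fuse it additively at the token.
        out[idx] += kg_entity_emb[name] @ W_proj
    return out

hidden = rng.normal(size=(seq_len, d_model))
fused = inject_kg(hidden, [(1, "Paris"), (3, "France")])
```

Non-entity positions pass through unchanged, so the fusion only perturbs tokens that were linked to the graph; in a real setup `W_proj` would be trained jointly with the T5 fine-tuning objective.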