Addressing Hallucinations in Language Models with Knowledge Graph Embeddings as an Additional Modality

📅 2024-11-18
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address pervasive factual hallucinations in large language model (LLM) generation, this paper proposes a lightweight knowledge-injection method that requires neither external retrieval nor fine-tuning of the LLM itself: knowledge graph (KG) embeddings are treated as an additional modality and aligned to the LLM's latent space through a small trainable adapter, enabling cross-modal fusion of KG and linguistic representations. The authors construct WikiEntities, a dataset of over 3 million Wikipedia texts annotated with Wikidata entities, and generate the KG embeddings with PyTorch-BigGraph. With adapters trained for Mistral 7B, LLaMA 2-7B (chat), and LLaMA 3-8B (instruct), the approach improves factual accuracy on the HaluEval, True-False, and FEVER benchmarks and reduces hallucinations without degrading the LLMs' performance on other tasks. The core contribution is the "KG embeddings as a modality" paradigm: efficient, general-purpose knowledge enhancement that grounds generation in facts while preserving pre-trained capabilities.

📝 Abstract
In this paper we present an approach to reduce hallucinations in Large Language Models (LLMs) by incorporating Knowledge Graphs (KGs) as an additional modality. Our method involves transforming input text into a set of KG embeddings and using an adapter to integrate these embeddings into the language model space, without relying on external retrieval processes. To facilitate this, we created WikiEntities, a dataset containing over 3 million Wikipedia texts annotated with entities from Wikidata and their corresponding embeddings from PyTorch-BigGraph. This dataset serves as a valuable resource for training Entity Linking models and adapting the described method to various LLMs using specialized adapters. Our method does not require fine-tuning of the language models themselves; instead, we only train the adapter. This ensures that the model's performance on other tasks is not affected. We trained an adapter for the Mistral 7B, LLaMA 2-7B (chat), and LLaMA 3-8B (instruct) models using this dataset and demonstrated that our approach improves performance on the HaluEval, True-False benchmarks and FEVER dataset. The results indicate that incorporating KGs as a new modality can effectively reduce hallucinations and improve the factual accuracy of language models, all without the need for external retrieval.
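The adapter described above can be pictured as a small trainable projection that maps entity embeddings into the LLM's hidden space, where they are concatenated with the ordinary token embeddings; the LLM stays frozen. The sketch below is illustrative only: the class name, layer layout, and dimensions (200-d is a common PyTorch-BigGraph Wikidata embedding size, 4096 matches a 7B-scale model) are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class KGAdapter(nn.Module):
    """Hypothetical adapter: projects KG entity embeddings into the
    LLM's hidden space so they can be prepended as extra "tokens".

    Dimensions are illustrative: PyTorch-BigGraph Wikidata embeddings
    are commonly 200-d; 4096 matches a 7B-parameter LLM.
    """

    def __init__(self, kg_dim: int = 200, llm_dim: int = 4096):
        super().__init__()
        # A simple two-layer MLP; only these weights are trained.
        self.proj = nn.Sequential(
            nn.Linear(kg_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, kg_embeds: torch.Tensor) -> torch.Tensor:
        # kg_embeds: (batch, num_entities, kg_dim)
        # returns:   (batch, num_entities, llm_dim)
        return self.proj(kg_embeds)


# Toy usage: three linked entities for one input text.
adapter = KGAdapter()
entity_vecs = torch.randn(1, 3, 200)   # from entity linking + PBG lookup
kg_tokens = adapter(entity_vecs)       # shape (1, 3, 4096)

# In the full pipeline these would be concatenated with the frozen
# LLM's token embeddings, e.g. (names hypothetical):
#   text_embeds = llm.get_input_embeddings()(input_ids)
#   inputs_embeds = torch.cat([kg_tokens, text_embeds], dim=1)
```

Because gradients flow only through the adapter, the base model's weights, and hence its performance on unrelated tasks, are untouched, which matches the paper's claim that no LLM fine-tuning is needed.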
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Factuality Errors
Misimagination

Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Graph Integration
Adapter Training
Enhanced Reasoning