KBLaM: Knowledge Base augmented Language Model

📅 2024-10-14

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

177K/year

🤖 AI Summary

To address the limitations of external retrieval dependency, high computational overhead, and infrequent knowledge updates in large language model (LLM) knowledge enhancement, this paper proposes KBLaM—a retrieval-free, fine-tuning-free knowledge injection framework. KBLaM encodes structured knowledge bases (>10K triples) into continuous key-value vector pairs and integrates them end-to-end into an 8B-parameter LLM via a customized rectangular attention mechanism, enabling dynamic, single-GPU knowledge loading and real-time updates. Its computational complexity scales linearly with knowledge base size and supports interpretable tracking of knowledge usage. Experiments demonstrate that KBLaM significantly outperforms retrieval-augmented generation (RAG) and in-context learning (ICL) baselines on question answering and open-ended reasoning tasks. Deployed on a single A100 GPU, it achieves low latency and high interpretability, establishing the first efficient, scalable, plug-and-play paradigm for internalizing structured knowledge into LLMs.

Technology Category

Application Category

📝 Abstract

In this paper, we propose Knowledge Base augmented Language Model (KBLaM), a new method for augmenting Large Language Models (LLMs) with external knowledge. KBLaM works with a knowledge base (KB) constructed from a corpus of documents, transforming each piece of knowledge in the KB into continuous key-value vector pairs via pre-trained sentence encoders with linear adapters and integrating them into pre-trained LLMs via a specialized rectangular attention mechanism. Unlike Retrieval-Augmented Generation, KBLaM eliminates external retrieval modules, and unlike in-context learning, its computational overhead scales linearly with KB size rather than quadratically. Our approach enables integrating a large KB of more than 10K triples into an 8B pre-trained LLM of only 8K context window on one single A100 80GB GPU and allows for dynamic updates without model fine-tuning or retraining. Experiments demonstrate KBLaM's effectiveness in various tasks, including question-answering and open-ended reasoning, while providing interpretable insights into its use of the augmented knowledge. Code and datasets are available at https://github.com/microsoft/KBLaM/

Problem

Research questions and friction points this paper is trying to address.

Augmenting LLMs with external knowledge

Eliminating external retrieval modules

Scaling linearly with KB size

Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Base integration

Rectangular attention mechanism

Dynamic KB updates

🔎 Similar Papers

Large Language Model Enhanced Knowledge Representation Learning: A Survey