KBLaM: Knowledge Base augmented Language Model

📅 2024-10-14
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address three limitations of existing LLM knowledge-enhancement approaches (dependence on external retrieval, high computational overhead, and slow knowledge updates), this paper proposes KBLaM, a retrieval-free, fine-tuning-free knowledge-injection framework. KBLaM encodes a structured knowledge base of more than 10K triples into continuous key-value vector pairs and integrates them end-to-end into an 8B-parameter LLM via a customized rectangular attention mechanism, enabling the entire KB to be loaded on a single GPU and updated dynamically in real time. Its computational cost scales linearly with KB size, and its attention weights provide interpretable tracking of which knowledge the model uses. Experiments demonstrate that KBLaM significantly outperforms retrieval-augmented generation (RAG) and in-context learning (ICL) baselines on question-answering and open-ended reasoning tasks. Running on a single A100 GPU, it achieves low latency and high interpretability, establishing an efficient, scalable, plug-and-play paradigm for internalizing structured knowledge into LLMs.
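The rectangular attention idea in the summary can be sketched in a few lines: prompt queries attend jointly over the KB's key-value pairs and the (causally masked) prompt, while KB entries attend to nothing, so the score matrix has shape T × (M + T) and cost grows linearly in the KB size M. This is a toy single-head NumPy illustration under my own simplifying assumptions (no projections, no heads), not the paper's implementation:

```python
import numpy as np

def rectangular_attention(Q, K_prompt, V_prompt, K_kb, V_kb):
    """Toy 'rectangular' attention: T prompt queries attend over
    M KB key-value pairs plus the T prompt tokens; KB entries are
    never queries, so the score matrix is T x (M + T)."""
    T, d = Q.shape
    M = K_kb.shape[0]
    K = np.concatenate([K_kb, K_prompt], axis=0)   # (M + T, d)
    V = np.concatenate([V_kb, V_prompt], axis=0)   # (M + T, d)
    scores = Q @ K.T / np.sqrt(d)                  # (T, M + T), linear in M
    # Causal mask applies only to the prompt columns;
    # all KB columns stay fully visible to every query.
    causal = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[:, M:][causal] = -1e30
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                             # (T, d)
```

Because each query row costs O(M + T) rather than the O((M + T)²) of putting the KB in-context, growing the KB adds only linear work.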

πŸ“ Abstract
In this paper, we propose Knowledge Base augmented Language Model (KBLaM), a new method for augmenting Large Language Models (LLMs) with external knowledge. KBLaM works with a knowledge base (KB) constructed from a corpus of documents, transforming each piece of knowledge in the KB into continuous key-value vector pairs via pre-trained sentence encoders with linear adapters and integrating them into pre-trained LLMs via a specialized rectangular attention mechanism. Unlike Retrieval-Augmented Generation, KBLaM eliminates external retrieval modules, and unlike in-context learning, its computational overhead scales linearly with KB size rather than quadratically. Our approach enables integrating a large KB of more than 10K triples into an 8B pre-trained LLM with only an 8K context window on a single A100 80GB GPU, and allows for dynamic updates without model fine-tuning or retraining. Experiments demonstrate KBLaM's effectiveness in various tasks, including question-answering and open-ended reasoning, while providing interpretable insights into its use of the augmented knowledge. Code and datasets are available at https://github.com/microsoft/KBLaM/
Problem

Research questions and friction points this paper is trying to address.

Augmenting LLMs with external knowledge
Eliminating external retrieval modules
Scaling linearly with KB size
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Base integration
Rectangular attention mechanism
Dynamic KB updates
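The first and third innovations above fit together: each KB triple is encoded once into a key-value pair, and a dynamic update is just recomputing that pair, with no retraining. Here is a hypothetical sketch; `toy_encode`, `W_K`, and `W_V` are stand-ins I invented for the paper's pre-trained sentence encoder and learned linear adapters:

```python
import numpy as np

def triple_to_kv(triple, encode, W_K, W_V):
    """Map one (subject, relation, object) triple to a continuous
    key-value pair: the key summarizes what the entry is about,
    the value carries its content. Updating the KB just means
    re-running this for the changed triple."""
    s, r, o = triple
    key = W_K @ encode(f"the {r} of {s}")
    value = W_V @ encode(o)
    return key, value

def toy_encode(text, dim=16):
    """Deterministic stand-in for a sentence encoder (demo only)."""
    rng = np.random.default_rng(sum(text.encode()))
    return rng.standard_normal(dim)
```

Usage: `triple_to_kv(("KBLaM", "institution", "Microsoft Research"), toy_encode, W_K, W_V)` yields one key vector and one value vector that the rectangular attention can consume directly.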
Xi Wang
University of Massachusetts, Amherst
Liana Mikaelyan
Microsoft
Taketomo Isazawa
Microsoft Research
James Hensman
Microsoft Research
machine learning · probabilistic modelling · biostatistics · Gaussian processes · approximate inference