Granite Embedding Models

📅 2025-02-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the longstanding trade-off between efficiency and effectiveness in multilingual information retrieval (IR) embedding models. It introduces the Granite family of embedding models, comprising 12-layer dense and sparse base models and 6-layer distilled lightweight variants, supporting both English and multilingual retrieval. The approach combines retrieval-oriented pretraining, contrastive fine-tuning, knowledge distillation, and model merging, trained on enterprise-grade, high-quality, license-compliant data to balance accuracy, inference efficiency, and regulatory requirements. On IBM's internal retrieval benchmarks, Granite significantly outperforms open-source models of comparable size, and it matches the performance of comparable open models on widely used IR benchmarks such as MSMARCO and BEIR. All Granite models are released under the Apache 2.0 license, providing an efficient, production-ready open-source foundation for industrial multilingual IR systems.
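The contrastive fine-tuning step mentioned above typically uses an in-batch InfoNCE-style objective: each query is paired with its positive document, and the other documents in the batch act as negatives. A minimal numpy sketch under that assumption (the temperature value and batching here are illustrative, not the report's exact recipe):

```python
import numpy as np

def info_nce_loss(query_vecs, doc_vecs, temperature=0.05):
    """In-batch InfoNCE loss: the positive for query i is doc i;
    all other docs in the batch serve as negatives."""
    # Cosine similarities between L2-normalized embeddings.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = q @ d.T / temperature  # (batch, batch) similarity matrix
    # Softmax cross-entropy with the diagonal as the target class.
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))
```

Aligned query/document pairs drive the diagonal similarities up relative to the off-diagonal negatives, so the loss approaches zero as positives separate from in-batch negatives.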

📝 Abstract
We introduce the Granite Embedding models, a family of encoder-based embedding models designed for retrieval tasks, spanning dense-retrieval and sparse-retrieval architectures, with both English and multilingual capabilities. This report provides the technical details of training these highly effective 12-layer embedding models, along with their efficient 6-layer distilled counterparts. Extensive evaluations show that the models, developed with techniques such as retrieval-oriented pretraining, contrastive fine-tuning, knowledge distillation, and model merging, significantly outperform publicly available models of similar sizes on both internal IBM retrieval and search tasks, and achieve equivalent performance on widely used information retrieval benchmarks, while being trained on high-quality data suitable for enterprise use. We publicly release all our Granite Embedding models under the Apache 2.0 license, allowing both research and commercial use, at https://huggingface.co/collections/ibm-granite.
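The model merging step named in the abstract can be sketched as parameter-wise averaging of several fine-tuned checkpoints ("model soup" style). The uniform weighting below is an assumption; the report's actual merging recipe may weight checkpoints differently:

```python
import numpy as np

def merge_checkpoints(state_dicts, weights=None):
    """Merge fine-tuned checkpoints by averaging each named parameter.
    Uniform weights are an illustrative default, not the report's recipe."""
    n = len(state_dicts)
    weights = weights if weights is not None else [1.0 / n] * n
    merged = {}
    for name in state_dicts[0]:
        # Weighted sum of the same parameter tensor across checkpoints.
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged
```

Averaging checkpoints fine-tuned from the same base model often retains the strengths of each run without increasing inference cost, which fits the efficiency focus described here.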
Problem

Research questions and friction points this paper is trying to address.

Develop effective, efficient embedding models for English and multilingual retrieval tasks
Compare performance against open models on retrieval and search benchmarks
Release the models under a permissive license for research and commercial use
Innovation

Methods, ideas, or system contributions that make the work stand out.

encoder-based embedding models
retrieval-oriented pretraining techniques
multilingual dense and sparse retrieval architectures
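The dense and sparse retrieval architectures listed above score query-document pairs differently: dense models compare fixed-size embedding vectors, while sparse models (SPLADE-style) compare per-term weights over a shared vocabulary. A minimal sketch of the two scoring functions, assuming cosine similarity for the dense side:

```python
import numpy as np

def dense_score(q_vec, d_vec):
    """Dense retrieval: cosine similarity between fixed-size embeddings."""
    return float(q_vec @ d_vec / (np.linalg.norm(q_vec) * np.linalg.norm(d_vec)))

def sparse_score(q_weights, d_weights):
    """Sparse retrieval: dot product over vocabulary terms, using the
    per-term weights an encoder assigns to query and document."""
    return sum(w * d_weights.get(term, 0.0) for term, w in q_weights.items())
```

Sparse scores only involve terms present on both sides, so they pair naturally with inverted indexes, while dense scores support approximate nearest-neighbor search; a hybrid system can combine both signals.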