🤖 AI Summary
This work addresses the challenge of building a general-purpose multilingual and multimodal text embedding model that uniformly supports diverse downstream tasks, including classification, retrieval, clustering, similarity estimation, and ranking. To this end, the authors propose Gemini Embedding, the first Gemini-based universal embedding model, which combines prompt engineering, representation distillation, multi-task supervised fine-tuning, and large-scale multilingual contrastive learning to distill Gemini's strong multilingual and code understanding into precomputable, plug-and-play embedding vectors. The model supports over 250 languages as well as code. Evaluated on the MMTEB benchmark of more than 100 tasks, it consistently outperforms specialized domain-specific models across languages, tasks, and modalities, achieving state-of-the-art performance with a single unified model and demonstrating the viability of a universal embedding paradigm driven by one large model.
📝 Abstract
In this report, we introduce Gemini Embedding, a state-of-the-art embedding model leveraging the power of Gemini, Google's most capable large language model. Capitalizing on Gemini's inherent multilingual and code understanding capabilities, Gemini Embedding produces highly generalizable embeddings for text spanning numerous languages and textual modalities. The representations generated by Gemini Embedding can be precomputed and applied to a variety of downstream tasks including classification, similarity, clustering, ranking, and retrieval. Evaluated on the Massive Multilingual Text Embedding Benchmark (MMTEB), which includes over one hundred tasks across 250+ languages, Gemini Embedding substantially outperforms prior state-of-the-art models, demonstrating considerable improvements in embedding quality. Achieving state-of-the-art performance across MMTEB's multilingual, English, and code benchmarks, our unified model demonstrates strong capabilities across a broad selection of tasks and surpasses specialized domain-specific models.
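Both the summary and the abstract stress that the embeddings can be precomputed once and then reused across downstream tasks such as retrieval and similarity. A minimal sketch of that pattern, using random vectors as stand-ins for real model outputs (the embedding call itself is not shown; this illustrates only the downstream cosine-similarity retrieval step):

```python
import numpy as np

def cosine_retrieve(query_vec, corpus_vecs, top_k=3):
    """Rank precomputed corpus embeddings by cosine similarity to a query."""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity per document
    order = np.argsort(-scores)[:top_k] # indices of the top_k best matches
    return order, scores[order]

# Stand-in embeddings: in practice these would be produced by the embedding
# model once, stored, and reused for retrieval, clustering, ranking, etc.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(100, 64))              # 100 precomputed document vectors
query = corpus[42] + 0.01 * rng.normal(size=64)  # slight perturbation of doc 42

idx, scores = cosine_retrieve(query, corpus)
# doc 42 should rank first, since the query is nearly identical to it
```

The key point is that the expensive model forward pass happens once per document; all downstream tasks then operate on the cached vectors with cheap linear-algebra operations.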