LEANN: A Low-Storage Vector Index

📅 2025-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the excessive storage overhead (1.5–7× the original data size) and resulting deployment challenges of embedding indices for on-device vector retrieval, this paper proposes a lightweight indexing method that combines a compressed graph structure with on-the-fly embedding recomputation. The approach pairs a compact graph-based index with a lightweight approximate nearest neighbor (ANN) search algorithm that recomputes embeddings at query time rather than storing them, striking a favorable accuracy-latency trade-off under tight storage constraints. Experiments show the index occupies ≤5% of the original data size, up to 50× smaller than conventional indexes, while attaining 90% top-3 recall within 2 seconds on real-world question-answering tasks. This makes high-performance vector retrieval practical on resource-constrained edge devices.
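The core idea above, traversing a proximity graph while recomputing embeddings on the fly instead of loading them from a stored index, can be sketched as follows. This is an illustrative toy, not the LEANN implementation: `embed`, `docs`, `neighbors`, and `search` are hypothetical stand-ins, and the toy `embed` is a deterministic hash rather than a real model. Only the raw documents and the adjacency lists would need to persist on disk; embeddings are produced per visited node at query time and discarded afterward.

```python
import heapq
import numpy as np


def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding model (deterministic toy hash)."""
    seed = abs(hash(text)) % (2**32)
    return np.random.default_rng(seed).standard_normal(8)


def search(query_vec, docs, neighbors, entry=0, k=3, budget=50):
    """Greedy best-first graph traversal that embeds nodes lazily."""
    cache = {}  # node -> embedding recomputed for this query only

    def dist(node):
        if node not in cache:
            cache[node] = embed(docs[node])  # recompute, don't load from disk
        return float(np.linalg.norm(cache[node] - query_vec))

    visited = {entry}
    frontier = [(dist(entry), entry)]  # min-heap ordered by distance
    best = []                          # max-heap of (-distance, node), size <= k
    while frontier and budget > 0:
        d, node = heapq.heappop(frontier)
        budget -= 1
        heapq.heappush(best, (-d, node))
        if len(best) > k:
            heapq.heappop(best)        # drop the current worst candidate
        for nb in neighbors[node]:
            if nb not in visited:
                visited.add(nb)
                heapq.heappush(frontier, (dist(nb), nb))
    # Sort so the closest node comes first.
    return [node for _, node in sorted(best, reverse=True)]


# Usage on a toy 20-node ring-like graph:
docs = [f"doc {i}" for i in range(20)]
neighbors = {i: [(i + 1) % 20, (i + 5) % 20] for i in range(20)}
top = search(embed("doc 3"), docs, neighbors, k=3)
```

The `budget` parameter caps how many nodes are embedded per query, which is where the accuracy-latency trade-off mentioned above would surface in practice.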

📝 Abstract
Embedding-based search is widely used in applications such as recommendation and retrieval-augmented generation (RAG). Recently, there has been growing demand to support these capabilities over personal data stored locally on devices. However, maintaining the data structures required for embedding-based search is often infeasible due to their high storage overhead. For example, indexing 100 GB of raw data requires 150 to 700 GB of storage, making local deployment impractical. Reducing this overhead while maintaining search quality and latency is therefore a critical challenge. In this paper, we present LEANN, a storage-efficient approximate nearest neighbor (ANN) search index optimized for resource-constrained personal devices. LEANN combines a compact graph-based structure with an efficient on-the-fly recomputation strategy to enable fast and accurate retrieval with minimal storage overhead. Our evaluation shows that LEANN reduces the index size to under 5% of the original raw data, achieving up to 50 times smaller storage than standard indexes, while maintaining 90% top-3 recall in under 2 seconds on real-world question answering benchmarks.
Problem

Research questions and friction points this paper is trying to address.

Reducing storage overhead for embedding-based search on devices
Maintaining search quality with minimal storage requirements
Enabling efficient local deployment of ANN search indexes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compact graph-based structure for efficient search
On-the-fly recomputation strategy to minimize storage
Reduces index size to under 5% of raw data
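Why storing only the graph (and recomputing embeddings) shrinks the index so dramatically can be seen with rough per-item arithmetic. The parameters below are assumptions for illustration, not figures from the paper: a 768-dimensional float32 embedding versus a 32-neighbor adjacency list of int32 node ids.

```python
# Hypothetical per-item storage accounting (assumed parameters, not from the paper).
dim, bytes_per_float = 768, 4      # assumed embedding width and precision
degree, bytes_per_id = 32, 4       # assumed graph degree and id width

embedding_bytes = dim * bytes_per_float  # 3072 bytes per item if embeddings are stored
graph_bytes = degree * bytes_per_id      # 128 bytes per item for adjacency only

ratio = embedding_bytes / graph_bytes    # per-item savings from dropping embeddings
print(embedding_bytes, graph_bytes, ratio)  # 3072 128 24.0
```

Under these assumed parameters, dropping stored embeddings cuts per-item index storage by roughly 24×; the actual savings depend on embedding dimension, precision, and graph degree.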