Cost-Effective, Low Latency Vector Search with Azure Cosmos DB

📅 2025-05-09
🤖 AI Summary
Problem: Cloud-native transactional databases (e.g., Azure Cosmos DB) lack efficient vector search capabilities. Method: This work natively integrates DiskANN, a high-performance disk-based vector indexing library, into a NoSQL engine, enabling unified storage and strong consistency between vector indexes and primary data. The authors propose a distributed index synchronization protocol, adaptive sharding, and memory-mapped file optimizations. Contribution/Results: On 10M vectors, the system achieves <20 ms P95 latency, high recall, and stable update throughput. Query cost is roughly 15× lower than Zilliz and 41× lower than Pinecone Serverless, and the system scales elastically to 1B–10B vectors. It retains the advantages of a cloud database (high availability, durability, and horizontal scalability) without requiring a dedicated vector database, making the case for production-grade semantic search that is low-cost, low-latency, and strongly consistent.

📝 Abstract
Vector indexing enables semantic search over diverse corpora and has become an important interface to databases for both users and AI agents. Efficient vector search requires deep optimizations in database systems. This has motivated a new class of specialized vector databases that optimize for vector search quality and cost. Instead, we argue that a scalable, high-performance, and cost-efficient vector search system can be built inside a cloud-native operational database like Azure Cosmos DB while leveraging the benefits of a distributed database such as high availability, durability, and scale. We do this by deeply integrating DiskANN, a state-of-the-art vector indexing library, inside Azure Cosmos DB NoSQL. This system uses a single vector index per partition, stored in existing index trees and kept in sync with the underlying data. It supports <20 ms query latency over an index spanning 10 million vectors, has stable recall over updates, and offers nearly 15x and 41x lower query cost compared to Zilliz and Pinecone serverless enterprise products. It also scales out to billions of vectors via automatic partitioning. This convergent design presents a point in favor of integrating vector indices into operational databases in the context of recent debates on specialized vector databases, and offers a template for vector indexing in other databases.
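The abstract's layout (one vector index per partition, kept in sync with the documents, with queries fanned out across partitions) can be illustrated with a toy sketch. This is not the paper's implementation: a brute-force scan stands in for DiskANN's graph index, routing uses a simple CRC32 hash, and the names `Partition` and `PartitionedCollection` are hypothetical.

```python
import heapq
import math
import zlib
from dataclasses import dataclass, field


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class Partition:
    """One partition: documents plus a single vector index updated alongside them."""
    docs: dict = field(default_factory=dict)   # doc_id -> document body
    index: dict = field(default_factory=dict)  # doc_id -> vector (brute-force stand-in for DiskANN)

    def upsert(self, doc_id, doc, vector):
        # The document and its index entry change in the same call,
        # so the index never lags behind the data in this sketch.
        self.docs[doc_id] = doc
        self.index[doc_id] = list(vector)

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)
        self.index.pop(doc_id, None)

    def search(self, query, k):
        # Exact scan; a real system would traverse an approximate graph index here.
        scored = ((l2_distance(query, v), doc_id) for doc_id, v in self.index.items())
        return heapq.nsmallest(k, scored)


class PartitionedCollection:
    """Routes writes to one partition and scatter-gathers top-k queries over all of them."""

    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def _route(self, doc_id):
        # Assumption: hash routing on the document id (a stand-in for a partition key).
        return self.partitions[zlib.crc32(doc_id.encode()) % len(self.partitions)]

    def upsert(self, doc_id, doc, vector):
        self._route(doc_id).upsert(doc_id, doc, vector)

    def delete(self, doc_id):
        self._route(doc_id).delete(doc_id)

    def search(self, query, k):
        # Each partition returns its local top-k; merging them yields the global top-k.
        local_hits = (hit for p in self.partitions for hit in p.search(query, k))
        return heapq.nsmallest(k, local_hits)


collection = PartitionedCollection(num_partitions=4)
for i in range(20):
    collection.upsert(f"doc{i}", {"n": i}, [float(i), 0.0])
nearest = [doc_id for _, doc_id in collection.search([3.2, 0.0], k=3)]
```

Because every partition returns its own top-k, the merged result is the exact global top-k for this brute-force stand-in; with an approximate index per partition, recall would depend on each partition's local recall.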
Problem

Research questions and friction points this paper is trying to address.

Enabling efficient semantic search via vector indexing in databases
Reducing query latency and cost in vector search systems
Integrating vector indices into operational databases for scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates DiskANN into Azure Cosmos DB
Uses single vector index per partition
Scales to billions via automatic partitioning
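From the user's side, the integration described above surfaces as container settings and a query function. The sketch below builds these as plain Python dictionaries, following the field names in the public Azure Cosmos DB for NoSQL vector search documentation as best recalled; the property name `embedding`, the dimension count, and the helper `build_top_k_query` are hypothetical, and the exact field names should be checked against the current documentation before use.

```python
VECTOR_PATH = "/embedding"  # hypothetical property that holds each document's vector

# Container vector policy: declares where vectors live and how distance is computed.
vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": VECTOR_PATH,
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 768,  # assumption: must match the embedding model in use
        }
    ]
}

# Indexing policy: requests a DiskANN index on the vector path and excludes
# that path from the regular index.
indexing_policy = {
    "indexingMode": "consistent",
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": VECTOR_PATH + "/*"}],
    "vectorIndexes": [{"path": VECTOR_PATH, "type": "diskANN"}],
}


def build_top_k_query(query_vector, k, vector_property="embedding"):
    """Return a parameterized top-k similarity query using VectorDistance."""
    if not isinstance(k, int) or k <= 0:
        raise ValueError("k must be a positive integer")
    distance = f"VectorDistance(c.{vector_property}, @queryVector)"
    return {
        "query": f"SELECT TOP {k} c.id, {distance} AS score FROM c ORDER BY {distance}",
        "parameters": [{"name": "@queryVector", "value": list(query_vector)}],
    }


query_spec = build_top_k_query([0.1, 0.2, 0.3], k=5)
```

The dictionaries and the query specification would then be passed to whichever SDK or REST call creates the container and runs the query; that step is omitted here because it requires a live account.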
Authors

Nitish Upreti (Microsoft)
Krishnan Sundaram (Microsoft)
Hari Sudan Sundar (Microsoft)
Samer Boshra (Microsoft)
Balachandar Perumalswamy (Microsoft)
Shivam Atri (Microsoft)
Martin Chisholm (Microsoft)
Revti Raman Singh (Microsoft)
Greg Yang (Microsoft)
Subramanyam Pattipaka (Microsoft)
Tamara Hass (Microsoft)
Nitesh Dudhey (Microsoft)
James Codella (Microsoft)
Mark Hildebrand (Microsoft)
Magdalen Manohar (Microsoft)
Jack Moffitt (Microsoft)
Haiyang Xu (Microsoft)
Naren Datha (Microsoft)
Suryansh Gupta (Microsoft)
Ravishankar Krishnaswamy (Microsoft Research; Algorithms)
Prashant Gupta (Microsoft)
Abhishek Sahu (Visiting Faculty, NISER; Algorithms)
Ritika Mor (Microsoft)
Santosh Kulkarni (Microsoft)
Hemeswari Varada (Microsoft)
Sudhanshu Barthwal (Microsoft)
Amar Sagare (Microsoft)
Dinesh Billa (Microsoft)
Zishan Fu (Microsoft)
Neil Deshpande (Microsoft)
Shaun Cooper (Microsoft)
Kevin Pilch (Microsoft)
Simon Moreno (Microsoft)
Aayush Kataria (Microsoft)
Vipul Vishal (Microsoft)
Harsha Vardhan Simhadri (Microsoft Research)