🤖 AI Summary
Problem: Distributed vector database performance on HPC platforms lacks systematic empirical characterization, particularly in scientific computing scenarios.
Method: This paper presents the first large-scale empirical evaluation of Qdrant on the Polaris supercomputer, using embeddings generated with Qwen3-Embedding-4B and a biomedical text workload drawn from BV-BRC, and quantitatively measures insertion latency, index construction time, and query latency at scale (32 nodes).
Contribution/Results: The study identifies critical bottlenecks—including network communication overhead, I/O scheduling inefficiencies, and GPU-CPU coordination constraints—and proposes HPC-aware configuration optimizations for vector databases. It delivers the first publicly available, reproducible benchmark for distributed vector databases in AI-native scientific computing environments, providing actionable insights and performance baselines for designing cross-architecture vector retrieval systems.
📝 Abstract
Vector databases have rapidly grown in popularity, enabling efficient similarity search over data such as text, images, and video. They now play a central role in modern AI workflows, aiding large language models by grounding model outputs in external literature through retrieval-augmented generation. Despite their importance, little is known about the performance characteristics of vector databases on the high-performance computing (HPC) systems that drive large-scale science. This work presents an empirical study of distributed vector database performance on the Polaris supercomputer at the Argonne Leadership Computing Facility. We construct a realistic biological-text workload from BV-BRC and generate embeddings from the peS2o corpus using Qwen3-Embedding-4B. We select Qdrant and evaluate insertion, index construction, and query latency with up to 32 workers. Informed by practical lessons from this experience, this work takes a first step toward characterizing vector database performance on HPC platforms to guide future research and optimization.
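The kind of per-operation latency measurement the abstract describes can be sketched with a small, stdlib-only harness. This is an illustrative stand-in, not the paper's benchmark: brute-force cosine search substitutes for a Qdrant query, and the vector dimension, corpus size, and query count are placeholder values (real Qwen3-Embedding-4B vectors are far higher-dimensional).

```python
import random
import statistics
import time

def cosine(a, b):
    # cosine similarity between two dense vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

DIM = 64          # placeholder dimension, not the real embedding size
N_DOCS = 500      # placeholder corpus size
N_QUERIES = 50    # placeholder query count

random.seed(0)
corpus = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_DOCS)]

# Time each query end to end; brute-force top-k search stands in for
# a call to a vector database such as Qdrant.
latencies = []
for _ in range(N_QUERIES):
    q = [random.gauss(0, 1) for _ in range(DIM)]
    t0 = time.perf_counter()
    top5 = sorted(range(N_DOCS), key=lambda i: -cosine(q, corpus[i]))[:5]
    latencies.append(time.perf_counter() - t0)

# Report the tail behavior that HPC benchmarks typically care about.
p50 = statistics.median(latencies)
p99 = statistics.quantiles(latencies, n=100)[98]
print(f"p50={p50 * 1e3:.3f} ms  p99={p99 * 1e3:.3f} ms")
```

The same pattern (wrap each operation in `time.perf_counter()` and aggregate percentiles) applies to insertion and index-construction timing; distributing it across workers is where the HPC-specific issues the study examines begin to appear.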