Bang for the Buck: Vector Search on Cloud CPUs

📅 2025-05-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Heterogeneous CPU architectures in cloud environments lack systematic, cross-vendor cost-performance evaluations for vector search, hindering optimal hardware selection. Method: We introduce the first comprehensive benchmarking framework for vector search across multi-generation CPUs—AMD (Zen 4), Intel (Sapphire Rapids), and AWS Graviton (v3/v4)—built upon Faiss and Annoy. It rigorously evaluates IVF and HNSW index structures under float32 and int8 quantization, measuring queries-per-second (QPS) and cost-efficiency (queries-per-dollar, QP$). Contribution/Results: Our analysis uncovers asymmetric microarchitectural impacts: Zen 4 achieves up to 3× higher QPS than Sapphire Rapids on IVF, yet underperforms on HNSW; Graviton3 attains the highest QP$ across most configurations—even surpassing Graviton4. These findings provide empirically grounded, reproducible guidance for hardware selection in cloud-based vector databases.

📝 Abstract
Vector databases have emerged as a new type of system that supports efficient querying of high-dimensional vectors. Many of these offer their database as a service in the cloud. However, the variety of available CPUs and the lack of vector search benchmarks across CPUs make it difficult for users to choose one. In this study, we show that CPU microarchitectures available in the cloud perform significantly differently across vector search scenarios. For instance, in an IVF index on float32 vectors, AMD's Zen4 gives almost 3x more queries per second (QPS) compared to Intel's Sapphire Rapids, but for HNSW indexes, the tables turn. However, when looking at the number of queries per dollar (QP$), Graviton3 is the best option for most indexes and quantization settings, even over Graviton4 (Table 1). With this work, we hope to guide users in getting the best "bang for the buck" when deploying vector search systems.
Problem

Research questions and friction points this paper is trying to address.

Evaluating CPU performance differences in cloud-based vector search scenarios
Comparing query efficiency (QPS) across AMD, Intel, and Graviton processors
Identifying cost-effective (QP$) CPU options for vector database deployments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarking CPU performance for vector search
Comparing QPS across different CPU architectures
Optimizing queries per dollar for cost efficiency
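The cost-efficiency metric above is a straightforward normalization of throughput by instance price. A minimal sketch, assuming QP$ is defined as sustained queries per hour divided by the instance's hourly price (the example figures are hypothetical, not the paper's measurements):

```python
def queries_per_dollar(qps: float, hourly_price_usd: float) -> float:
    """Convert sustained QPS and an instance's hourly on-demand price into QP$."""
    queries_per_hour = qps * 3600
    return queries_per_hour / hourly_price_usd

# Hypothetical numbers: an instance at $0.58/hour sustaining 2,000 QPS.
print(f"{queries_per_dollar(2000, 0.58):,.0f} queries per dollar")
```

Under this definition, a cheaper instance can beat a faster one on QP$ even while losing on raw QPS, which is the pattern the paper reports for Graviton3 versus Graviton4.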