🤖 AI Summary
The Kunpeng 920, a high-performance ARM-based server CPU, lacks vector search libraries optimized for it: existing libraries such as Faiss and DiskANN target x86 CPUs. This hinders its adoption for large-scale similarity search tasks.
Method: We design and implement KBest, a deeply optimized, high-performance vector search library tailored to the Kunpeng 920 platform. It combines hardware-aware and algorithmic optimizations: SIMD-accelerated distance computation, data prefetching, index refinement, early termination, and vector quantization.
Contribution/Results: Our system outperforms state-of-the-art vector search libraries running on x86 CPUs (e.g., Faiss, DiskANN), and its optimizations improve query throughput by over 2×. It serves tens of millions of queries per day for both internal business applications and external enterprise clients, indicating that ARM-based servers are competitive and practical for large-scale vector search workloads.
📝 Abstract
Vector search, which returns the vectors most similar to a given query vector from a large vector dataset, underlies many important applications such as search, recommendation, and LLMs. To be economical, vector search must be efficient, so that a given query workload consumes fewer resources. However, existing vector search libraries (e.g., Faiss and DiskANN) are optimized for x86 CPU architectures (i.e., Intel and AMD CPUs), whereas Huawei Kunpeng CPUs, which are competitive in compute power, are based on the ARM architecture. In this paper, we present KBest, a vector search library tailored for the latest Kunpeng 920 CPUs. To be efficient, KBest incorporates extensive hardware-aware and algorithmic optimizations, including single-instruction-multiple-data (SIMD) accelerated distance computation, data prefetching, index refinement, early termination, and vector quantization. Experimental results show that KBest outperforms state-of-the-art vector search libraries running on x86 CPUs, and that our optimizations improve query throughput by over 2×. Currently, KBest serves applications from both our internal business and external enterprise clients, handling tens of millions of queries daily.
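To make two of the listed optimizations concrete, the following is a minimal, hardware-agnostic Python sketch. It is not KBest's implementation. All function names (`quantize`, `dequantize`, `l2_sq_bounded`, `top_k`) are hypothetical, the quantizer is a plain 8-bit scalar quantizer with one global range, and "early termination" is interpreted here as abandoning a distance computation once its partial sum exceeds the current k-th best distance, which is an assumption since the abstract does not define the term. The brute-force scan stands in for an index, and the block-wise loop only mimics where a SIMD kernel would process one register-width chunk at a time.

```python
import heapq


def quantize(vectors):
    """Scalar-quantize float vectors to 8-bit codes using one global range."""
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 if all values are equal
    codes = [bytes(round((x - lo) / scale) for x in v) for v in vectors]
    return codes, lo, scale


def dequantize(code, lo, scale):
    """Map 8-bit codes back to approximate float values."""
    return [lo + c * scale for c in code]


def l2_sq_bounded(a, b, bound, block=16):
    """Squared L2 distance; returns None once the partial sum exceeds bound."""
    total = 0.0
    for start in range(0, len(a), block):
        for x, y in zip(a[start:start + block], b[start:start + block]):
            d = x - y
            total += d * d
        if total > bound:  # checked once per block, not per element
            return None
    return total


def top_k(query, vectors, k):
    """Exact top-k by linear scan, skipping candidates that cannot qualify."""
    heap = []  # max-heap emulated with negated distances: (-dist, index)
    for idx, v in enumerate(vectors):
        bound = -heap[0][0] if len(heap) == k else float("inf")
        dist = l2_sq_bounded(query, v, bound)
        if dist is None:
            continue
        if len(heap) < k:
            heapq.heappush(heap, (-dist, idx))
        else:
            heapq.heapreplace(heap, (-dist, idx))
    return sorted((-d, i) for d, i in heap)
```

Calling `top_k(query, data, 5)` returns a list of `(squared_distance, index)` pairs sorted by distance. The bounded distance check leaves the result exact while skipping work on candidates that are already too far away; quantization, by contrast, trades a reconstruction error of at most half a quantization step per dimension for a 4× smaller footprint than 32-bit floats.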