Vector Search for the Future: From Memory-Resident, to Heterogeneous Storage, to Cloud-Native Architectures

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the performance, latency, scalability, and cost challenges posed by the exponential growth of vector data by systematically tracing the evolution of vector search technologies through the lens of storage architecture. It proposes a cloud-native three-tier storage framework—comprising memory, SSD, and object storage—that integrates in-memory indexing methods such as IVF, hashing, quantization, and graph-based indices with heterogeneous storage techniques including block-level data layout, I/O optimization, and efficient index updates. Furthermore, it introduces a cloud-native data tiering strategy to orchestrate data placement across storage layers. The resulting framework provides a comprehensive theoretical foundation and practical guidance for building trillion-scale vector retrieval systems that achieve high throughput, low latency, and cost efficiency, while also outlining key directions for future research.
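The memory–SSD–object-storage tiering idea summarized above can be sketched as a simple placement policy. This is a hypothetical illustration, not the paper's actual orchestration logic: the class name, thresholds, and access-frequency criterion are all assumptions made for the example.

```python
class TieredPlacement:
    """Toy data-tiering policy: route vectors to memory, SSD, or
    object storage based on how often they are accessed (hypothetical
    thresholds; the paper's strategy may use different signals)."""

    def __init__(self, hot_threshold=100, warm_threshold=10):
        self.hot_threshold = hot_threshold    # accesses/hour for the memory tier
        self.warm_threshold = warm_threshold  # accesses/hour for the SSD tier

    def tier_for(self, accesses_per_hour):
        # Hot data stays in memory for low latency.
        if accesses_per_hour >= self.hot_threshold:
            return "memory"
        # Warm data goes to SSD: slower, but far cheaper per byte.
        if accesses_per_hour >= self.warm_threshold:
            return "ssd"
        # Cold data lands in object storage for minimal cost.
        return "object_storage"
```

In a real system the same decision would be driven by measured query statistics and cost models rather than fixed thresholds, but the three-way split mirrors the framework's memory/SSD/object-storage hierarchy.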

📝 Abstract
Vector search (VS) has become a fundamental component in multimodal data management, enabling core functionalities such as image, video, and code retrieval. As vector data scales rapidly, VS faces growing challenges in balancing search performance, latency, scalability, and cost. The evolution of VS has been closely driven by changes in storage architecture. Early VS methods rely on all-in-memory designs for low latency, but scalability is constrained by memory capacity and cost. To address this, recent research has adopted heterogeneous architectures that offload space-intensive vectors and index structures to SSDs, while exploiting block locality and I/O-efficient strategies to maintain high search performance at billion scale. Looking ahead, the increasing demand for trillion-scale vector retrieval and cloud-native elasticity is driving a further shift toward memory-SSD-object storage architectures, which enable cost-efficient data tiering and seamless scalability. In this tutorial, we review the evolution of VS techniques from a storage-architecture perspective. We first review memory-resident methods, covering classical IVF, hash, quantization, and graph-based designs. We then present a systematic overview of heterogeneous storage VS techniques, including their index designs, block-level layouts, query strategies, and update mechanisms. Finally, we examine emerging cloud-native systems and highlight open research opportunities for future large-scale vector retrieval systems.
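As a concrete illustration of the IVF family of memory-resident indices the abstract mentions, the following is a minimal inverted-file sketch in Python with NumPy. The cluster count, probe count, and k-means details are illustrative choices, not taken from the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)

def build_ivf(vectors, n_lists=8, iters=10):
    """Train a coarse k-means quantizer and build inverted lists
    (a toy IVF index: each list holds the ids of its cluster's vectors)."""
    centroids = vectors[rng.choice(len(vectors), n_lists, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmin(((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
        for c in range(n_lists):
            members = vectors[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    # Final assignment after the last centroid update.
    assign = np.argmin(((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
    inverted = {c: np.flatnonzero(assign == c) for c in range(n_lists)}
    return centroids, inverted

def ivf_search(query, vectors, centroids, inverted, k=5, nprobe=2):
    """Scan only the nprobe inverted lists whose centroids are closest
    to the query, then rank the candidates by exact distance."""
    probe = np.argsort(((centroids - query) ** 2).sum(-1))[:nprobe]
    cand = np.concatenate([inverted[c] for c in probe])
    dist = ((vectors[cand] - query) ** 2).sum(-1)
    return cand[np.argsort(dist)[:k]]
```

With `nprobe` equal to `n_lists` the search degenerates to an exhaustive scan; smaller `nprobe` trades recall for speed, which is the core IVF tradeoff. Production systems layer quantization (e.g. PQ) on top of this structure to compress the stored vectors.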
Problem

Research questions and friction points this paper is trying to address.

vector search
scalability
latency
cloud-native
storage architecture
Innovation

Methods, ideas, or system contributions that make the work stand out.

vector search
heterogeneous storage
cloud-native architecture
storage tiering
billion-scale retrieval