🤖 AI Summary
To address the severe performance degradation in billion-scale, semantically skewed embedding approximate nearest neighbor search (ANNS) on SSDs—caused by I/O bottlenecks—this paper proposes an end-to-end I/O-coordinated optimization framework. Our method introduces three core innovations: (1) automated selection of heterogeneous local indexes, dynamically adapting to varying cluster sizes; (2) query-aware dynamic navigation graphs that replace static routing to minimize erroneous partition probing; and (3) geometry-constrained, dual-granularity pruning—operating jointly on clusters and individual vectors—to eliminate “fetch-to-discard” reordering overhead. The system integrates offline performance profiling, memory-adaptive graph indexing, and a unified I/O orchestration engine. Evaluated on five standard benchmarks, our approach achieves up to 17.2× higher QPS and 25.0× lower latency versus state-of-the-art baselines (e.g., DiskANN), with significantly reduced SSD reads and zero accuracy loss.
📝 Abstract
Approximate nearest neighbor search (ANNS) at billion scale is fundamentally an out-of-core problem: vectors and indexes live on SSD, so performance is dominated by I/O rather than compute. Under skewed semantic embeddings, existing out-of-core systems break down in three ways: a one-size-fits-all local index mismatches clusters of different scales; static routing misguides queries and inflates the number of probed partitions; and pruning is incomplete at the cluster level and lossy at the vector level, triggering "fetch-to-discard" reranking on raw vectors.
We present OrchANN, an out-of-core ANNS engine that uses an I/O orchestration model for unified I/O governance along the route-access-verify pipeline. OrchANN selects a heterogeneous local index per cluster via offline auto-profiling, maintains a query-aware in-memory navigation graph that adapts to skewed workloads, and applies multi-level pruning with geometric bounds to filter both clusters and vectors before issuing SSD reads. Across five standard datasets under strict out-of-core constraints, OrchANN outperforms four baselines (DiskANN, Starling, SPANN, and PipeANN) in both QPS and latency while reducing SSD accesses, delivering up to 17.2× higher QPS and 25.0× lower latency than competing systems without sacrificing accuracy.
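The cluster-level half of the geometric pruning described above can be illustrated with a standard triangle-inequality bound: if a cluster has centroid c and radius r, every member v satisfies d(q, v) ≥ d(q, c) − r, so a cluster whose lower bound already exceeds the current k-th best distance cannot improve the result set and its SSD read can be skipped. The sketch below is illustrative only and assumes spherical clusters with stored centroids and radii; the names (`prune_clusters`, `kth_best`) are hypothetical, not from the paper.

```python
import numpy as np

def prune_clusters(query, centroids, radii, kth_best):
    """Return indices of clusters that may still contain a closer neighbor.

    Uses the triangle-inequality lower bound d(q, v) >= d(q, c) - r:
    clusters whose bound is already >= kth_best are pruned without I/O.
    (Illustrative sketch; not OrchANN's actual implementation.)
    """
    dists = np.linalg.norm(centroids - query, axis=1)  # d(q, c) per cluster
    lower_bounds = dists - radii                       # best-case member distance
    return np.flatnonzero(lower_bounds < kth_best)     # survivors need SSD reads

# Toy example with 3 clusters: bounds are [-1, 9, 4] for kth_best = 4.5,
# so only clusters 0 and 2 survive and cluster 1 is never fetched.
centroids = np.array([[0.0, 0.0], [10.0, 0.0], [3.0, 4.0]])
radii = np.array([1.0, 1.0, 1.0])
query = np.array([0.0, 0.0])
survivors = prune_clusters(query, centroids, radii, kth_best=4.5)
```

The same bound idea extends to the vector level by keeping per-vector norms or quantized coordinates in memory, which is how a dual-granularity scheme can reject work before any raw vector is read from SSD.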