🤖 AI Summary
Multi-vector databases lack systematic index tuning methodologies for multimodal and multi-feature scenarios, leading to high query latency and suboptimal trade-offs among storage cost, recall, and efficiency.
Method: This paper formally defines the multi-vector search index tuning problem and proposes a holistic framework that jointly optimizes query latency, storage overhead, and recall, departing from conventional single-vector and relational indexing paradigms. It introduces a workload-driven search space pruning algorithm, a multi-objective constrained modeling mechanism, and an efficient index evaluator.
Contribution/Results: Evaluated on real-world multi-vector workloads, our approach reduces query latency by 2.1× to 8.3× over state-of-the-art baselines while satisfying user-specified storage and recall constraints. It identifies Pareto-optimal index configurations, enabling principled, workload-aware index selection in multi-vector settings.
📄 Abstract
Vector search plays a crucial role in many real-world applications. Beyond single-vector search, multi-vector search has become important for today's multi-modal and multi-feature scenarios. In a multi-vector database, each row is an item, each column represents a feature of the items, and each cell is a high-dimensional vector. The choice of indexes in such databases can have a significant impact on performance. Although index tuning for relational databases has been extensively studied, index tuning for multi-vector search remains unclear and challenging. In this paper, we define the multi-vector search index tuning problem and propose a framework to solve it. Specifically, given a multi-vector search workload, we develop algorithms to find indexes that minimize latency while meeting storage and recall constraints. Compared to the baseline, our approach achieves a 2.1× to 8.3× latency speedup.
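To make the problem statement concrete, the following is a minimal, purely illustrative sketch of the tuning objective described above: choose one index per vector column so that total estimated latency is minimized subject to storage and recall constraints. All index names, cost numbers, and the product-of-recalls model are assumptions for illustration; they are not the paper's actual cost model or algorithm (which prunes the search space rather than enumerating it exhaustively).

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class IndexOption:
    name: str           # candidate index type, e.g. "Flat", "HNSW" (illustrative)
    latency_ms: float   # estimated per-query latency contribution
    storage_gb: float   # estimated storage footprint
    recall: float       # estimated per-column recall

def tune(columns, storage_budget_gb, recall_target):
    """Brute-force sketch: pick the per-column index combination with the
    lowest total latency that satisfies both constraints."""
    best = None
    for combo in product(*columns.values()):
        storage = sum(o.storage_gb for o in combo)
        # Simplistic assumption: overall recall is the product of per-column recalls.
        recall = 1.0
        for o in combo:
            recall *= o.recall
        if storage > storage_budget_gb or recall < recall_target:
            continue  # violates a user-specified constraint
        latency = sum(o.latency_ms for o in combo)
        if best is None or latency < best[0]:
            best = (latency, combo)
    return best

# Two feature columns (e.g. image and text embeddings), two index choices each.
columns = {
    "image": [IndexOption("Flat", 50.0, 1.0, 1.00),
              IndexOption("HNSW", 5.0, 2.0, 0.95)],
    "text":  [IndexOption("Flat", 40.0, 0.5, 1.00),
              IndexOption("IVF",  8.0, 0.8, 0.92)],
}

best = tune(columns, storage_budget_gb=3.0, recall_target=0.90)
print(best[0], [o.name for o in best[1]])  # → 45.0 ['HNSW', 'Flat']
```

Even this toy version shows why the problem is hard: the configuration space grows exponentially with the number of vector columns, which motivates the paper's pruning and evaluation machinery.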