TRIM: Accelerating High-Dimensional Vector Similarity Search with Enhanced Triangle-Inequality-Based Pruning

📅 2025-08-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
High-dimensional vector similarity search (HVSS) suffers from the “curse of dimensionality,” in particular distance concentration, which makes the lower bounds derived from the triangle inequality too small to prune effectively in high dimensions. To address this, the paper proposes TRIM, a pruning enhancement that (1) optimizes the landmark vectors used to form the triangles and (2) relaxes the resulting lower bounds by a user-adjustable degree, so that the bounds become large enough to prune again. TRIM can be integrated into both memory-based (e.g., HNSW, IVFPQ) and disk-based (e.g., DiskANN) retrieval methods. In the reported experiments, TRIM reaches pruning ratios of up to 99% for memory-based methods, improving graph-based search by up to 90% and quantization-based search by up to 200%; for disk-based methods, it reduces I/O costs by up to 58% and improves efficiency by 102%, while preserving high query accuracy. The key contribution is a versatile pruning enhancement that counters the loss of triangle-inequality pruning power in high dimensions and applies across memory-based and disk-based search methods.

📝 Abstract
High-dimensional vector similarity search (HVSS) is critical for many data processing and AI applications. However, traditional HVSS methods often require extensive data access for distance calculations, leading to inefficiencies. Triangle-inequality-based lower bound pruning is a widely used technique to reduce the number of data accesses in low-dimensional spaces but becomes less effective in high-dimensional settings. This is attributed to the "distance concentration" phenomenon, where the lower bounds derived from the triangle inequality become too small to be useful. To address this, we propose TRIM, which enhances the effectiveness of traditional triangle-inequality-based pruning in high-dimensional vector similarity search in two key ways: (1) optimizing the landmark vectors used to form the triangles, and (2) relaxing the lower bounds derived from the triangle inequality, with the relaxation degree adjustable according to the user's needs. TRIM is a versatile operation that can be seamlessly integrated into both memory-based (e.g., HNSW, IVFPQ) and disk-based (e.g., DiskANN) HVSS methods, reducing distance calculations and disk access. Extensive experiments show that TRIM enhances memory-based methods, improving graph-based search by up to 90% and quantization-based search by up to 200%, while achieving a pruning ratio of up to 99%. It also reduces I/O costs by up to 58% and improves efficiency by 102% for disk-based methods, while preserving high query accuracy.
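
For readers unfamiliar with the bound the abstract refers to, the standard triangle-inequality lower bound and its pruning rule can be written as follows. The symbols q (query), x (data vector), l (landmark), and τ (current search threshold) are notation chosen here for illustration. The multiplicative factor γ in the last line is only an assumed stand-in for "relaxing the lower bound"; the abstract does not state the form TRIM actually uses.

```latex
% Standard triangle-inequality lower bound via a landmark l
\[
  \mathrm{LB}(q, x; l) \;:=\; \lvert d(q, l) - d(x, l) \rvert \;\le\; d(q, x)
\]
% Exact pruning rule: x cannot be within the threshold \tau, so d(q, x) is never computed
\[
  \mathrm{LB}(q, x; l) > \tau \;\Longrightarrow\; d(q, x) > \tau
\]
% Hypothetical relaxed rule (assumption): prune more aggressively, no longer guaranteed exact
\[
  \gamma \cdot \mathrm{LB}(q, x; l) > \tau, \qquad \gamma \ge 1
\]
```

With γ = 1 the rule is exact. Any γ > 1 trades some accuracy for a higher pruning ratio, which matches the abstract's description of a relaxation degree that the user can adjust.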
Problem

Research questions and friction points this paper is trying to address.

Enhancing triangle-inequality pruning for high-dimensional vector search
Addressing distance concentration, which makes triangle-inequality lower bounds too small to be useful in high dimensions
Reducing data access and distance calculations in HVSS
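
The distance concentration problem listed above can be observed with a small experiment. The sketch below is not from the paper: it assumes Gaussian random vectors, a single randomly chosen landmark, and Euclidean distance, and the function name `mean_bound_tightness` is made up for illustration. It measures how large the triangle-inequality lower bound is relative to the true distance in a low and a high dimension.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_bound_tightness(dim, n_points=200, seed=0):
    """Average of LB / true distance, where LB = |d(q, l) - d(x, l)|.

    A value near 1 means the bound is tight (useful for pruning);
    a value near 0 means the bound is too small to prune anything.
    """
    rng = random.Random(seed)

    def rand_vec():
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    landmark = rand_vec()
    query = rand_vec()
    d_ql = euclidean(query, landmark)

    ratios = []
    for _ in range(n_points):
        x = rand_vec()
        true_d = euclidean(query, x)
        lower = abs(d_ql - euclidean(x, landmark))
        ratios.append(lower / true_d)
    return sum(ratios) / len(ratios)


low = mean_bound_tightness(dim=2)
high = mean_bound_tightness(dim=256)
print(f"dim=2: {low:.3f}  dim=256: {high:.3f}")
```

Under these assumptions the ratio drops sharply as the dimension grows, because all pairwise distances cluster around the same value and their differences become small compared with the distances themselves.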
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizing landmark vectors for triangle formation
Relaxing lower bounds with adjustable degree
Seamlessly integrating into memory and disk methods
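
The general idea behind landmark-based pruning with a relaxed bound can be sketched as a linear-scan k-NN search. This is a minimal illustration under stated assumptions, not TRIM itself: the landmark is picked arbitrarily rather than optimized, the relaxation is modeled as a hypothetical multiplicative factor `gamma`, and the function names are invented for this example.

```python
import heapq
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_landmark_table(data, landmark):
    """Precompute d(x, landmark) for every data vector (done once, offline)."""
    return [euclidean(x, landmark) for x in data]


def knn_with_pruning(query, data, landmark, table, k, gamma=1.0):
    """Linear-scan k-NN that skips vectors whose (relaxed) lower bound
    already exceeds the current k-th best distance.

    gamma is a hypothetical relaxation factor (an assumption of this sketch):
    gamma == 1.0 uses the exact triangle-inequality bound, while
    gamma > 1.0 prunes more aggressively and may miss true neighbors.
    Returns (sorted list of (distance, index), number of distances computed).
    """
    d_ql = euclidean(query, landmark)
    heap = []  # max-heap emulated with negated distances: (-distance, index)
    computed = 0
    for i, x in enumerate(data):
        if len(heap) == k:
            lower = gamma * abs(d_ql - table[i])
            if lower > -heap[0][0]:
                continue  # pruned: x is never read, no distance is computed
        d = euclidean(query, x)
        computed += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    return sorted((-nd, i) for nd, i in heap), computed


# Usage on synthetic data (8 dimensions, 500 vectors).
rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(500)]
query = [rng.gauss(0.0, 1.0) for _ in range(8)]
landmark = data[0]  # arbitrary choice; TRIM optimizes its landmarks instead
table = build_landmark_table(data, landmark)

exact, n_exact = knn_with_pruning(query, data, landmark, table, k=5, gamma=1.0)
relaxed, n_relaxed = knn_with_pruning(query, data, landmark, table, k=5, gamma=2.0)
print(f"distances computed: exact bound={n_exact}, relaxed bound={n_relaxed}")
```

With `gamma=1.0` the result matches a brute-force scan, since the bound never exceeds the true distance. Larger values skip more vectors at the cost of possibly missing neighbors. Because the pruning step only needs the precomputed table, the same check could in principle sit in front of any distance computation or disk read, which is the sense in which such an operation can be layered onto existing search methods.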
Yitong Song
Shanghai Jiao Tong University
Pengcheng Zhang
Beihang University
computer vision
Chao Gao
Zilliz
Bin Yao
Shanghai Jiao Tong University
Kai Wang
Shanghai Jiao Tong University
Zongyuan Wu
Alibaba Group
Lin Qu
Taobao