pHNSW: PCA-Based Filtering to Accelerate HNSW Approximate Nearest Neighbor Search

📅 2026-02-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work targets the main bottlenecks of the Hierarchical Navigable Small World (HNSW) algorithm in high-dimensional approximate nearest neighbor search: heavy distance computation, irregular memory access patterns, and substantial memory bandwidth demands. The authors propose pHNSW, an algorithm-hardware co-optimized solution. At the algorithmic level, PCA-based dimensionality reduction serves as a filtering step (reportedly applied for the first time in this context), which reduces the volume of neighbor data accessed and the cost of distance computations. At the hardware level, they design a dedicated pHNSW processor with custom instructions. The processor RTL was synthesized in a 65 nm technology node and evaluated with the DDR4 and HBM1.0 DRAM standards. Reported queries per second (QPS) gains over a standard HNSW implementation are 14.47× to 21.37× on a CPU and 5.37× to 8.46× on a GPU, with energy consumption reduced by up to 57.4%.

📝 Abstract
Hierarchical Navigable Small World (HNSW) has demonstrated impressive accuracy and low latency for high-dimensional nearest neighbor searches. However, its high computational demands and irregular, large-volume data access patterns present significant challenges to search efficiency. To address these challenges, we introduce pHNSW, an algorithm-hardware co-optimized solution that accelerates HNSW through Principal Component Analysis (PCA) filtering. On the algorithm side, we apply PCA filtering to reduce the dimensionality of the dataset, thereby lowering the volume of neighbor access and decreasing the computational load for distance calculations. On the hardware side, we design the pHNSW processor with custom instructions to optimize search throughput and energy efficiency. In the experiments, we synthesized the pHNSW processor RTL design with a 65 nm technology node and evaluated it using the DDR4 and HBM1.0 DRAM standards. The results show that pHNSW boosts Queries per Second (QPS) by 14.47x-21.37x on a CPU and 5.37x-8.46x on a GPU, while reducing energy consumption by up to 57.4% compared to a standard HNSW implementation.
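The abstract does not spell out how the PCA filter is wired into the search, so the following is only a minimal sketch of the general filter-then-refine idea, not the authors' implementation. It assumes a precomputed projection (`mean`, `components`), ranks a node's neighbors by distance in the reduced space, and computes full-dimensional distances only for the best `keep` candidates. All names here (`filter_then_refine`, `low_vecs`, `keep`) are hypothetical, and the toy "PCA" basis is simply the first two coordinate axes rather than components learned from data.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def project(vec, mean, components):
    """Center vec, then project it onto each component (one row per component)."""
    centered = [x - m for x, m in zip(vec, mean)]
    return [sum(c * x for c, x in zip(row, centered)) for row in components]


def filter_then_refine(query, neighbor_ids, full_vecs, low_vecs, mean, components, keep):
    """Shortlist neighbors in the reduced space, then rank the shortlist exactly."""
    q_low = project(query, mean, components)
    # Stage 1 (filter): cheap low-dimensional distance for every neighbor.
    shortlist = heapq.nsmallest(
        keep, neighbor_ids, key=lambda i: squared_l2(q_low, low_vecs[i])
    )
    # Stage 2 (refine): full-dimensional distance only for the shortlist.
    return sorted((squared_l2(query, full_vecs[i]), i) for i in shortlist)


# Toy data (assumption): five 4-D vectors standing in for one node's neighbors.
full_vecs = {
    0: [0.0, 0.0, 0.0, 0.0],
    1: [1.0, 0.0, 0.0, 0.0],
    2: [5.0, 5.0, 0.0, 0.0],
    3: [0.0, 1.0, 0.5, 0.0],
    4: [9.0, 9.0, 9.0, 9.0],
}
mean = [0.0, 0.0, 0.0, 0.0]
# Hypothetical orthonormal basis; a real system would fit it with PCA offline.
components = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
# Reduced vectors are computed once, ahead of query time.
low_vecs = {i: project(v, mean, components) for i, v in full_vecs.items()}

query = [0.9, 0.1, 0.0, 0.0]
result = filter_then_refine(
    query, list(full_vecs), full_vecs, low_vecs, mean, components, keep=2
)
print([i for _, i in result])  # prints [1, 0]
```

In this sketch only two of the five neighbors ever have their full vectors read, which is the kind of saving in data access and distance arithmetic the abstract describes. The trade-off is that a shortlist built in the reduced space can miss a true nearest neighbor, so `keep` and the number of components would need tuning against recall.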
Problem

Research questions and friction points this paper is trying to address.

HNSW
approximate nearest neighbor search
high-dimensional data
search efficiency
computational overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

PCA filtering
HNSW acceleration
algorithm-hardware co-design
approximate nearest neighbor search
energy-efficient processor