Exact Nearest-Neighbor Search on Energy-Efficient FPGA Devices

πŸ“… 2025-10-19
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses the low energy efficiency and high latency of exact k-nearest neighbor (k-NN) search in high-dimensional latent spaces. It proposes two large-scale, energy-efficient retrieval schemes tailored to neural encoder representations, sharing a unified FPGA hardware architecture. One scheme prioritizes throughput, the other latency; together they combine batch-level parallelism and in-memory query parallelism to support both streaming and memory-resident data scenarios. Experiments demonstrate that, while maintaining 100% recall, the methods achieve up to 16.6Γ— higher throughput and the lowest observed latency compared to state-of-the-art CPU implementations, along with up to 11.9Γ— lower energy consumption. The core contribution is the first FPGA-based k-NN hardware acceleration framework that simultaneously guarantees lossless accuracy, adaptive deployment across diverse retrieval scenarios, and co-optimized energy efficiency and real-time performance.

πŸ“ Abstract
This paper investigates the usage of FPGA devices for energy-efficient exact kNN search in high-dimensional latent spaces. This work aligns with a broader trend of supporting the increasing popularity of learned representations based on neural encoder models by making their large-scale adoption greener and more inclusive. The paper proposes two different energy-efficient solutions adopting the same FPGA low-level configuration. The first solution maximizes system throughput by processing the queries of a batch in parallel over a streamed dataset that does not fit into the FPGA memory. The second minimizes latency by processing each incoming kNN query in parallel over an in-memory dataset. Reproducible experiments on publicly available image and text datasets show that our solution outperforms state-of-the-art CPU-based competitors regarding throughput, latency, and energy consumption. Specifically, experiments show that the proposed FPGA solutions achieve the best throughput in terms of queries per second and the best-observed latency, with scale-up factors of up to 16.6X. Similar considerations hold for energy efficiency, where results show that our solutions can achieve up to 11.9X energy savings with respect to strong CPU-based competitors.
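The throughput-oriented scheme can be illustrated in software. The sketch below (a minimal NumPy approximation, not the paper's FPGA implementation; the function name and chunking interface are assumptions for illustration) shows the core idea: a whole batch of queries is matched in parallel against each streamed dataset chunk, and only a running top-k per query is kept in memory while the chunk itself is discarded, giving exact (100% recall) results without holding the full dataset.

```python
import numpy as np

def knn_streamed_batch(query_batch, dataset_chunks, k):
    """Exact k-NN for a batch of queries over a dataset streamed in chunks.

    Each chunk is compared against the whole query batch at once
    (batch-level parallelism), then discarded; only the running top-k
    distances/indices per query survive between chunks.
    """
    n_q = query_batch.shape[0]
    best_d = np.full((n_q, k), np.inf)          # running top-k distances
    best_i = np.full((n_q, k), -1, dtype=int)   # running top-k indices
    offset = 0                                  # global index of chunk start
    for chunk in dataset_chunks:
        # squared Euclidean distances, batch vs chunk: shape (n_q, n_chunk)
        d = ((query_batch[:, None, :] - chunk[None, :, :]) ** 2).sum(-1)
        ids = np.arange(offset, offset + chunk.shape[0])
        # merge chunk candidates with the running top-k, re-select k best
        all_d = np.concatenate([best_d, d], axis=1)
        all_i = np.concatenate(
            [best_i, np.broadcast_to(ids, (n_q, ids.size))], axis=1)
        order = np.argsort(all_d, axis=1)[:, :k]
        best_d = np.take_along_axis(all_d, order, axis=1)
        best_i = np.take_along_axis(all_i, order, axis=1)
        offset += chunk.shape[0]
    return best_d, best_i
```

Because every point is compared against every query, the result is identical to a full brute-force scan, which is what "lossless accuracy" requires.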
Problem

Research questions and friction points this paper is trying to address.

Enabling energy-efficient exact kNN search in high-dimensional latent spaces
Optimizing throughput via parallel batch processing on streamed datasets
Minimizing latency through parallel query processing on in-memory datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

FPGA devices enable energy-efficient exact kNN search
Parallel batch processing maximizes throughput on streamed data
In-memory parallel query processing minimizes latency for kNN
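The latency-oriented idea in the bullets above can be sketched similarly. In this minimal NumPy approximation (again not the paper's hardware design; the function name and partition interface are hypothetical), the in-memory dataset is split into partitions that stand in for parallel FPGA scan lanes: each lane produces a local top-k for the single incoming query, and the partial results are merged into the exact global top-k.

```python
import numpy as np

def knn_single_query_parallel(query, partitions, k):
    """Exact k-NN for one query over an in-memory, partitioned dataset.

    Each loop iteration stands in for one parallel lane scanning its
    partition; the per-partition top-k lists are merged at the end.
    """
    partial_d, partial_i = [], []
    offset = 0  # global index of the current partition's first point
    for part in partitions:
        d = ((part - query) ** 2).sum(axis=1)   # squared distances in lane
        top = np.argsort(d)[:k]                 # local top-k of this lane
        partial_d.append(d[top])
        partial_i.append(top + offset)
        offset += part.shape[0]
    # merge step: k best among all per-lane candidates
    d_all = np.concatenate(partial_d)
    i_all = np.concatenate(partial_i)
    order = np.argsort(d_all)[:k]
    return d_all[order], i_all[order]
```

Since each partition contributes its full local top-k, the merged result matches a sequential scan of the whole dataset, while the per-lane scans can run concurrently to cut per-query latency.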