Probabilistic Kernel Function for Fast Angle Testing

📅 2025-05-26
🤖 AI Summary
This paper studies the angle testing problem in high-dimensional Euclidean spaces. It proposes two projection-based probabilistic kernel functions, one for angle comparison and one for angle thresholding. Unlike conventional approaches that draw projection vectors from Gaussian distributions, the kernels use reference angles and a deterministic structure for the projection vectors. They do not rely on asymptotic assumptions (e.g., the number of projection vectors tending to infinity), and the authors show both theoretically and experimentally that they outperform Gaussian-distribution-based kernel functions. Applied to approximate nearest neighbor search (ANNS), the approach achieves 2.5× to 3× higher queries-per-second (QPS) throughput than the graph-based search algorithm HNSW.

📝 Abstract
In this paper, we study the angle testing problem in high-dimensional Euclidean spaces and propose two projection-based probabilistic kernel functions, one designed for angle comparison and the other for angle thresholding. Unlike existing approaches that rely on random projection vectors drawn from Gaussian distributions, our approach leverages reference angles and employs a deterministic structure for the projection vectors. Notably, our kernel functions do not require asymptotic assumptions, such as the number of projection vectors tending to infinity, and can be both theoretically and experimentally shown to outperform Gaussian-distribution-based kernel functions. We further apply the proposed kernel function to Approximate Nearest Neighbor Search (ANNS) and demonstrate that our approach achieves 2.5× to 3× higher queries-per-second (QPS) throughput compared to the state-of-the-art graph-based search algorithm HNSW.
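For context, the Gaussian-projection baseline that the abstract contrasts against can be sketched as follows. This is not the paper's kernel function: it is the classical sign-random-projection estimator, which uses the fact that two vectors fall on opposite sides of a random Gaussian hyperplane with probability equal to their angle divided by π. The function names `estimate_angle` and `closer_in_angle` and the parameters `num_projections` and `seed` are hypothetical, chosen only for illustration.

```python
import numpy as np


def estimate_angle(x, q, num_projections=4096, seed=0):
    """Estimate the angle (in radians) between x and q from Gaussian sign projections."""
    rng = np.random.default_rng(seed)
    # Each row is one projection vector drawn from a standard Gaussian.
    projections = rng.standard_normal((num_projections, x.shape[0]))
    # x and q land on opposite sides of a random hyperplane
    # with probability angle / pi, so the disagreement rate estimates angle / pi.
    disagree = np.sign(projections @ x) != np.sign(projections @ q)
    return np.pi * disagree.mean()


def closer_in_angle(q, x1, x2, **kwargs):
    """Angle comparison: True if x1 appears closer in angle to q than x2 does."""
    return estimate_angle(x1, q, **kwargs) < estimate_angle(x2, q, **kwargs)


def vector_at_angle(theta, dim=64):
    """Build a unit vector at angle theta from the first coordinate axis."""
    v = np.zeros(dim)
    v[0], v[1] = np.cos(theta), np.sin(theta)
    return v
```

The variance of this estimate shrinks only as `num_projections` grows, so a small projection budget gives a noisy comparison. Calling `closer_in_angle(vector_at_angle(0.0), vector_at_angle(0.3), vector_at_angle(1.2))` compares a candidate at 0.3 radians against one at 1.2 radians from the query.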
Problem

Research questions and friction points this paper is trying to address.

Proposes probabilistic kernel functions for high-dimensional angle testing
Eliminates asymptotic assumptions in existing angle comparison methods
Improves ANNS query throughput (QPS) by 2.5× to 3× over HNSW
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deterministic projection vectors using reference angles
No asymptotic assumptions for kernel functions
Higher QPS throughput in ANNS applications