Approximate Reverse $k$-Ranks Queries in High Dimensions

📅 2025-04-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies the approximate reverse $k$-ranks query problem in high-dimensional spaces: given a query item vector $\mathbf{q}$, a set of user vectors $\mathbf{U}$, and a set of item vectors $\mathbf{P}$, efficiently retrieve the $k$ users $\mathbf{u} \in \mathbf{U}$ for which $\mathbf{q}$ ranks highest among $\mathbf{P}$ under inner-product scoring with respect to $\mathbf{u}$. We formally define this problem and propose the first lightweight algorithmic framework with theoretical guarantees, integrating inner-product indexing, adaptive pruning, and probabilistic error control. Our theoretical analysis establishes sublinear query complexity, overcoming the computational bottleneck of exact ranking computation in high dimensions. Experiments on million-scale high-dimensional datasets demonstrate over 10× speedup versus state-of-the-art baselines, with recall exceeding 95% and controllable approximation accuracy.

📝 Abstract
Many objects are represented as high-dimensional vectors nowadays. In this setting, the relevance between two objects (vectors) is usually evaluated by their inner product. Recently, item-centric searches, which search for users relevant to query items, have received attention and find important applications, such as product promotion and market analysis. To support these applications, this paper considers reverse $k$-ranks queries. Given a query vector $\mathbf{q}$, $k$, a set $\mathbf{U}$ of user vectors, and a set $\mathbf{P}$ of item vectors, this query retrieves the $k$ user vectors $\mathbf{u} \in \mathbf{U}$ with the highest $r(\mathbf{q},\mathbf{u},\mathbf{P})$, where $r(\mathbf{q},\mathbf{u},\mathbf{P})$ shows the rank of $\mathbf{q}$ for $\mathbf{u}$ among $\mathbf{P}$. Because efficiently computing the exact answer for this query is difficult in high dimensions, we address the problem of approximate reverse $k$-ranks queries. Informally, given an approximation factor $c$, this problem allows, as an output, a user $\mathbf{u}'$ such that $r(\mathbf{q},\mathbf{u}',\mathbf{P}) > \tau$ but $r(\mathbf{q},\mathbf{u}',\mathbf{P}) \leq c \times \tau$, where $\tau$ is the rank threshold for the exact answer. We propose a new algorithm for solving this problem efficiently. Through theoretical and empirical analyses, we confirm the efficiency and effectiveness of our algorithm.
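To make the query definition concrete, the following is a minimal brute-force sketch of an exact reverse $k$-ranks query and of the $c$-approximation condition. It is not the paper's algorithm. It assumes that the rank $r(\mathbf{q},\mathbf{u},\mathbf{P})$ is 1 plus the number of items in $\mathbf{P}$ whose inner product with $\mathbf{u}$ is strictly larger than that of $\mathbf{q}$, so a "highest" rank corresponds to the smallest rank number, and that $\tau$ is the worst rank in the exact answer. The function names `rank`, `reverse_k_ranks`, and `is_c_approximate` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def inner(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank(q: Vector, u: Vector, P: Sequence[Vector]) -> int:
    """Assumed rank definition: 1 + number of items that u scores strictly above q."""
    score_q = inner(u, q)
    return 1 + sum(1 for p in P if inner(u, p) > score_q)


def reverse_k_ranks(
    q: Vector, k: int, U: Sequence[Vector], P: Sequence[Vector]
) -> List[Tuple[int, int]]:
    """Exact brute force: returns the k (rank, user_index) pairs with the best ranks.

    Costs O(|U| * |P| * d) inner-product work, which is the bottleneck
    that an approximate method tries to avoid.
    """
    ranked = sorted((rank(q, u, P), i) for i, u in enumerate(U))
    return ranked[:k]


def is_c_approximate(
    answer: Sequence[Tuple[int, int]],
    exact: Sequence[Tuple[int, int]],
    c: float,
) -> bool:
    """Check that every returned user has rank at most c * tau.

    tau is taken to be the worst (largest) rank in the exact answer.
    """
    tau = exact[-1][0]
    return all(r <= c * tau for r, _ in answer)


# Tiny illustrative data set (made up for this sketch).
U = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
P = [(2.0, 0.0), (0.0, 2.0), (1.0, 1.0)]
q = (1.0, 1.5)

exact_top1 = reverse_k_ranks(q, 1, U, P)
print(exact_top1)
```

With $c = 2$, a result that returns a user of rank 2 in place of the exact rank-1 user would still be accepted by `is_c_approximate`, while $c = 1.5$ would reject it. This is the slack that the approximation factor trades for efficiency.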
Problem

Research questions and friction points this paper is trying to address.

Efficiently solving approximate reverse k-ranks queries
Handling high-dimensional vector data for relevance ranking
Improving item-centric searches for applications like product promotion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Approximate reverse k-ranks queries algorithm
Handles high-dimensional vector data efficiently
Uses approximation factor c for scalability
Daichi Amagata
The University of Osaka & Nagoya University
clustering, outlier detection, spatio-temporal databases, multi-dimensional databases, data stream
Kazuyoshi Aoyama
The University of Osaka, Suita, Osaka, Japan
Keito Kido
The University of Osaka, Suita, Osaka, Japan
Sumio Fujita
LY Corporation, Chiyoda, Tokyo, Japan