Ranking Vectors Clustering: Theory and Applications

📅 2025-07-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies the *k-centroids ranking vectors clustering* (KRC) problem: given a set of ranking vectors, each an ordered list of distinct integers representing preferences, partition them into *k* clusters and find a centroid for each cluster that is itself a ranking vector. The authors prove that KRC is NP-hard and characterize its feasible set. The paper makes three core contributions: (1) a closed-form solution for the optimal centroid of a single cluster, computable in linear time; (2) KRCA, an approximation algorithm that iteratively refines initial solutions from classical *k*-means clustering (KMC), together with a branch-and-bound (BnB) algorithm that uses a decision tree framework and a controlling parameter to trade solution quality against computation time during cluster reconstruction; (3) theoretical error bounds for KRCA and BnB, plus experiments on synthetic and real-world datasets in which KRCA consistently improves on the KMC baseline solutions in solution quality with fast computational times. The abstract highlights personalization and large-scale decision making as application areas.

📝 Abstract

We study the problem of clustering ranking vectors, where each vector represents preferences as an ordered list of distinct integers. Specifically, we focus on the k-centroids ranking vectors clustering problem (KRC), which aims to partition a set of ranking vectors into k clusters and identify the centroid of each cluster. Unlike classical k-means clustering (KMC), KRC constrains both the observations and centroids to be ranking vectors. We establish the NP-hardness of KRC and characterize its feasible set. For the single-cluster case, we derive a closed-form analytical solution for the optimal centroid, which can be computed in linear time. To address the computational challenges of KRC, we develop an efficient approximation algorithm, KRCA, which iteratively refines initial solutions from KMC, referred to as the baseline solution. Additionally, we introduce a branch-and-bound (BnB) algorithm for efficient cluster reconstruction within KRCA, leveraging a decision tree framework to reduce computational time while incorporating a controlling parameter to balance solution quality and efficiency. We establish theoretical error bounds for KRCA and BnB. Through extensive numerical experiments on synthetic and real-world datasets, we demonstrate that KRCA consistently outperforms baseline solutions, delivering significant improvements in solution quality with fast computational times. This work highlights the practical significance of KRC for personalization and large-scale decision making, offering methodological advancements and insights that can be built upon in future studies.
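The abstract does not state the closed-form centroid itself. As a minimal sketch, assuming the objective is the total squared Euclidean distance (as in k-means) and that ranking vectors are permutations of 1..m, the rearrangement inequality implies that ranking the coordinate-wise sums of the cluster's vectors gives an optimal ranking-vector centroid. The function names `rank_centroid` and `total_sq_dist` are hypothetical, and the sketch uses a comparison sort, so it makes no attempt to match the linear-time claim in the paper.

```python
from typing import List, Sequence


def rank_centroid(rankings: Sequence[Sequence[int]]) -> List[int]:
    """Return a permutation of 1..m that minimizes the total squared
    Euclidean distance to the given ranking vectors (assumed objective)."""
    m = len(rankings[0])
    # Coordinate-wise sums; these are proportional to the unconstrained mean.
    totals = [sum(r[j] for r in rankings) for j in range(m)]
    # Order positions by their sum; ties are broken by index, and any
    # tie-break gives the same objective value.
    order = sorted(range(m), key=lambda j: (totals[j], j))
    centroid = [0] * m
    for rank, j in enumerate(order, start=1):
        centroid[j] = rank
    return centroid


def total_sq_dist(rankings: Sequence[Sequence[int]], c: Sequence[int]) -> int:
    """Sum of squared Euclidean distances from each ranking vector to c."""
    return sum(sum((a - b) ** 2 for a, b in zip(r, c)) for r in rankings)


if __name__ == "__main__":
    cluster = [[1, 2, 3], [1, 3, 2], [2, 1, 3]]
    c = rank_centroid(cluster)
    print(c, total_sq_dist(cluster, c))
```

On small inputs the result can be checked against a brute-force search over all permutations, which is a quick way to test the assumed objective.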

Problem

Research questions and friction points this paper is trying to address.

Clustering ranking vectors that encode ordered preferences

Solving the NP-hard k-centroids ranking vectors clustering (KRC) problem

Developing efficient approximation and branch-and-bound (BnB) algorithms

Innovation

Methods, ideas, or system contributions that make the work stand out.

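Closed-form, linear-time solution for the optimal centroid of a single cluster

Approximation algorithm KRCA iteratively refines initial KMC solutions

Branch-and-bound with a decision tree framework speeds up cluster reconstruction within KRCA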
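To make the constraint on centroids concrete, here is a naive Lloyd-style loop that alternates between assigning vectors to the nearest centroid and recomputing each centroid as a ranking vector. This is not KRCA or the paper's BnB procedure, whose details the page does not give. It is only an illustrative heuristic, assuming squared Euclidean distance and permutations of 1..m, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[int], b: Sequence[int]) -> int:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_project(totals: Sequence[int]) -> List[int]:
    """Turn coordinate sums into a ranking vector (permutation of 1..m)."""
    order = sorted(range(len(totals)), key=lambda j: (totals[j], j))
    ranks = [0] * len(totals)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


def naive_rank_kmeans(
    data: Sequence[Sequence[int]],
    init_centroids: Sequence[Sequence[int]],
    max_iter: int = 100,
) -> Tuple[List[int], List[List[int]]]:
    """Illustrative heuristic only (not KRCA): alternate assignment and
    ranking-vector centroid updates until the labels stop changing."""
    centroids = [list(c) for c in init_centroids]
    labels = [-1] * len(data)
    for _ in range(max_iter):
        # Assignment step: nearest centroid for each ranking vector.
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(x, centroids[k]))
            for x in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid stays a valid ranking vector.
        for k in range(len(centroids)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = rank_project([sum(col) for col in zip(*members)])
    return labels, centroids


if __name__ == "__main__":
    data = [[1, 2, 3], [1, 3, 2], [3, 2, 1], [3, 1, 2]]
    print(naive_rank_kmeans(data, [[1, 2, 3], [3, 2, 1]]))
```

Like ordinary k-means, a loop of this kind can stall in a local optimum that depends on the initial centroids.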

🔎 Similar Papers

Categorical data clustering: 25 years beyond K-modes