Do you know what q-means?

📅 2023-08-18
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
Existing quantum k-means algorithms suffer from low efficiency on large-scale datasets and heavy reliance on quantum linear algebra subroutines. Method: This paper proposes an improved q-means algorithm achieving ε-approximate k-means clustering while eliminating quantum linear algebra entirely. It relies only on QRAM-based quantum state preparation and multivariate quantum amplitude estimation, and introduces the first accompanying "dequantized" classical algorithm. Results: The quantum version runs in time Õ(‖V‖_F/√n · k^{5/2}d/ε · (√k + log n)), while the classical counterpart runs in O(‖V‖_F²/n · k²/ε² · (kd + log n)). Both scale only logarithmically in the dataset size n. Key contributions: (i) the first q-means framework requiring no quantum linear algebra primitives; and (ii) the first dequantized classical algorithm achieving the same ε-approximation guarantee with comparable runtime dependence on k, d, and n.
📝 Abstract
Clustering is one of the most important tools for analysis of large datasets, and perhaps the most popular clustering algorithm is Lloyd's iteration for $k$-means. This iteration takes $n$ vectors $V=[v_1,\dots,v_n]\in\mathbb{R}^{n\times d}$ and outputs $k$ centroids $c_1,\dots,c_k\in\mathbb{R}^d$; these partition the vectors into clusters based on which centroid is closest to a particular vector. We present an overall improved version of the "$q$-means" algorithm, the quantum algorithm originally proposed by Kerenidis, Landman, Luongo, and Prakash (NeurIPS'19) which performs $\varepsilon$-$k$-means, an approximate version of $k$-means clustering. Our algorithm does not rely on quantum linear algebra primitives of prior work, but instead only uses QRAM to prepare simple states based on the current iteration's clusters and multivariate quantum amplitude estimation. The time complexity is $\widetilde{O}\big(\frac{\|V\|_F}{\sqrt{n}}\frac{k^{5/2}d}{\varepsilon}(\sqrt{k} + \log n)\big)$ and maintains the logarithmic dependence on $n$ while improving the dependence on most of the other parameters. We also present a "dequantized" algorithm for $\varepsilon$-$k$-means which runs in $O\big(\frac{\|V\|_F^2}{n}\frac{k^{2}}{\varepsilon^2}(kd + \log n)\big)$ time. Notably, this classical algorithm matches the logarithmic dependence on $n$ attained by the quantum algorithm.
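For background, Lloyd's iteration (the classical baseline that q-means approximates) alternates between assigning each vector to its nearest centroid and moving each centroid to the mean of its cluster. A minimal NumPy sketch, for illustration only (`lloyd_kmeans` and its parameters are our naming, not the paper's):

```python
import numpy as np

def lloyd_kmeans(V, k, iters=20, seed=0):
    """One run of Lloyd's iteration for k-means.

    V: (n, d) data matrix. Returns (k, d) centroids and per-vector labels.
    """
    rng = np.random.default_rng(seed)
    n, d = V.shape
    # Initialize centroids from k distinct data points.
    centroids = V[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.zeros(n, dtype=int)
    for _ in range(iters):
        # Assignment step: each vector joins its closest centroid.
        dists = np.linalg.norm(V[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to its cluster's mean.
        for j in range(k):
            members = V[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, labels
```

Each full iteration costs O(nkd) time; the point of the paper's quantum and dequantized algorithms is to replace this linear scan over all n vectors with estimates whose cost depends on n only logarithmically.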
Problem

Research questions and friction points this paper is trying to address.

Improves quantum q-means algorithm for clustering large datasets.
Reduces reliance on quantum linear algebra primitives.
Introduces classical algorithm matching quantum efficiency.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Improved quantum q-means algorithm without quantum linear algebra
Uses QRAM and multivariate quantum amplitude estimation
Dequantized classical algorithm matches quantum logarithmic dependence
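To give intuition for how a classical algorithm can scale logarithmically in n: instead of averaging an entire cluster to update a centroid, one can estimate the mean from a small sample. The sketch below is a deliberately simplified illustration of this sampling idea only; the paper's actual dequantized algorithm uses a norm-weighted (ℓ2) sampling scheme with rigorous ε-approximation guarantees, and `sampled_centroid` is a hypothetical helper name:

```python
import numpy as np

def sampled_centroid(V, labels, j, m, rng):
    """Estimate the centroid of cluster j from a uniform sample of m
    of its members, avoiding a full pass over the cluster."""
    members = np.flatnonzero(labels == j)
    # Sample (with replacement) m member indices and average only those.
    idx = rng.choice(members, size=min(m, len(members)), replace=True)
    return V[idx].mean(axis=0)
```

By standard concentration arguments, the estimate's error shrinks like 1/√m per coordinate, so a sample size independent of n already gives a fixed-accuracy centroid estimate.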
J. F. Doriguello
Centre for Quantum Technologies, National University of Singapore, Singapore
Alessandro Luongo
Centre for Quantum Technologies
quantum machine learning, quantum algorithms
Ewin Tang
University of California at Berkeley