Low-degree Lower bounds for clustering in moderate dimension

📅 2026-02-26
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work investigates the minimum separation $\Delta$ between component means required for partial recovery of the cluster structure in isotropic Gaussian mixture models in the moderate-dimensional regime ($n \geq dK$), aiming to bridge the gap between the information-theoretic limit and the performance of known polynomial-time algorithms. It establishes a new low-degree polynomial lower bound for this regime when $d \geq K$. Whereas hardness in the high-dimensional regime ($n \leq dK$) is driven primarily by dimension reduction and spectral methods, the moderate-dimensional regime involves more delicate phenomena that lead to a "non-parametric rate". The authors also give a novel non-spectral algorithm matching this rate, shedding light on the computational limits of clustering in moderate dimension.

📝 Abstract
We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. Specifically, we investigate the requisite minimal distance $\Delta$ between mean vectors to partially recover the underlying partition. While the minimax-optimal threshold for $\Delta$ is well-established, a significant gap exists between this information-theoretic limit and the performance of known polynomial-time procedures. Although this gap was recently characterized in the high-dimensional regime ($n \leq dK$), it remains largely unexplored in the moderate-dimensional regime ($n \geq dK$). In this manuscript, we address this regime by establishing a new low-degree polynomial lower bound for the moderate-dimensional case when $d \geq K$. We show that while the difficulty of clustering for $n \leq dK$ is primarily driven by dimension reduction and spectral methods, the moderate-dimensional regime involves more delicate phenomena leading to a "non-parametric rate". We provide a novel non-spectral algorithm matching this rate, shedding new light on the computational limits of the clustering problem in moderate dimension.
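
To make the setting in the abstract concrete, here is a minimal Python sketch of the data model only: $n$ points drawn from a mixture of $K$ isotropic Gaussians in $\mathbb{R}^d$ whose means are separated by $\Delta$, plus a check of which regime ($n \geq dK$ or not) a given triple falls in. It does not implement the paper's algorithm or its lower bound. The function names, the uniform mixing weights, the unit variance, and the choice of means $\mu_k = (\Delta/\sqrt{2})\, e_k$ are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def regime(n, d, K):
    """Label the regime by comparing n with d*K, as in the abstract.

    The abstract writes both n <= dK and n >= dK, so the boundary n == dK
    belongs to both; this sketch arbitrarily labels it "moderate".
    """
    return "moderate" if n >= d * K else "high"


def sample_mixture(n, d, K, delta, seed=0):
    """Draw n points from a uniform mixture of K Gaussians N(mu_k, I_d).

    Illustrative assumption: mu_k = (delta / sqrt(2)) * e_k, which needs
    d >= K and puts every pair of means at Euclidean distance exactly delta.
    """
    if d < K:
        raise ValueError("this toy construction needs d >= K")
    rng = random.Random(seed)
    scale = delta / math.sqrt(2)
    means = [[scale if j == k else 0.0 for j in range(d)] for k in range(K)]
    labels = [rng.randrange(K) for _ in range(n)]
    # Each point is its cluster mean plus standard Gaussian noise per coordinate.
    points = [[m + rng.gauss(0.0, 1.0) for m in means[k]] for k in labels]
    return points, labels, means


def min_separation(means):
    """Smallest Euclidean distance between two distinct mean vectors."""
    return min(
        math.dist(a, b) for i, a in enumerate(means) for b in means[i + 1:]
    )


if __name__ == "__main__":
    points, labels, means = sample_mixture(n=200, d=4, K=3, delta=2.0)
    print(regime(200, 4, 3), min_separation(means))
```

In this toy construction the condition $d \geq K$ is needed only so that the means can be placed along distinct coordinate axes; it happens to coincide with the condition under which the abstract states its lower bound, but the sketch does not rely on that result.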

Problem

Research questions and friction points this paper is trying to address.

clustering
moderate dimension
Gaussian mixture
computational lower bounds
minimax threshold

Innovation

Methods, ideas, or system contributions that make the work stand out.

low-degree polynomial
moderate dimension
clustering
non-parametric rate
computational lower bound