🤖 AI Summary
This work studies projective clustering, a model that unifies geometric covering and clustering: find $k$ affine subspaces of dimension $r$ in $\mathbb{R}^d$ that minimize the sum of squared distances from $n$ input points to their nearest subspace. The analysis uses parameterized complexity and reductions under the Exponential Time Hypothesis (ETH). The contributions are threefold: (i) Line Clustering ($r=1$, $d=2$) is shown to be W[1]-hard when parameterized by $k$ and, under ETH, admits no $n^{o(k)}$-time algorithm, in contrast to Line Cover, which is fixed-parameter tractable in $k$. (ii) Hyperplane Cover ($r=d-1$) is shown to be W[2]-hard when parameterized by $k$ alone, complementing its known fixed-parameter tractability in $k$ and $d$ combined. (iii) An algorithm for Projective Clustering runs in $n^{O(dk(r+1))}$ time, which matches the Line Clustering lower bound and generalizes the classic algorithm for $k$-means ($r=0$).
📝 Abstract
We study extensions of the classic \emph{Line Cover} problem, which asks whether a set of $n$ points in the plane can be covered using $k$ lines. Line Cover is known to be NP-hard, and we focus on two natural generalizations. The first is \textbf{Line Clustering}, where the goal is to find $k$ lines minimizing the sum of squared distances from the input points to their nearest line. The second is \textbf{Hyperplane Cover}, which asks whether $n$ points in $\mathbb{R}^d$ can be covered by $k$ hyperplanes.
We also study the more general \textbf{Projective Clustering} problem, which unifies both settings and has applications in machine learning, data analysis, and computational geometry. In this problem, one seeks $k$ affine subspaces of dimension $r$ that minimize the sum of squared distances from the given points in $\mathbb{R}^d$ to the nearest subspace.
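To make the objective concrete, the following minimal Python sketch evaluates the Projective Clustering cost of a given candidate solution. It is an illustration only, not the algorithm from the paper: the function names and the representation of an affine subspace as a pair `(origin, directions)` are assumptions introduced here. It only scores a fixed set of subspaces and does not search for an optimal one.

```python
from math import sqrt

def orthonormalize(directions, tol=1e-12):
    """Gram-Schmidt: turn direction vectors into an orthonormal basis of their span."""
    basis = []
    for v in directions:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip directions that are (numerically) dependent
            basis.append([wi / norm for wi in w])
    return basis

def squared_distance(point, subspace):
    """Squared Euclidean distance from a point to an affine subspace.

    `subspace` is a hypothetical (origin, directions) pair: a point on the
    subspace plus vectors spanning its direction space. An empty list of
    directions gives a single point (r = 0, the k-means setting).
    """
    origin, directions = subspace
    diff = [p - o for p, o in zip(point, origin)]
    # Remove the component of `diff` that lies inside the subspace.
    for b in orthonormalize(directions):
        coeff = sum(di * bi for di, bi in zip(diff, b))
        diff = [di - coeff * bi for di, bi in zip(diff, b)]
    return sum(di * di for di in diff)

def projective_clustering_cost(points, subspaces):
    """Sum over all points of the squared distance to the nearest subspace."""
    return sum(min(squared_distance(p, s) for s in subspaces) for p in points)

# Line Clustering instance (r = 1, d = 2): the two coordinate axes as candidate lines.
points = [(0.0, 1.0), (2.0, -1.0), (5.0, 0.0)]
x_axis = ((0.0, 0.0), [(1.0, 0.0)])
y_axis = ((0.0, 0.0), [(0.0, 1.0)])
cost = projective_clustering_cost(points, [x_axis, y_axis])
```

In this example only the point $(2,-1)$ lies off both lines, at distance 1 from the $x$-axis, so the cost is 1. A cost of 0 corresponds to the covering variants (Line Cover, Hyperplane Cover), where every point must lie on some subspace.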
Our results reveal notable differences in the parameterized complexity of these problems. While Line Cover is fixed-parameter tractable when parameterized by $k$, we show that Line Clustering is W[1]-hard with respect to $k$ and does not admit an algorithm with running time $n^{o(k)}$ unless the Exponential Time Hypothesis fails. Hyperplane Cover is NP-hard even for $d=2$, and prior work of Langerman and Morin [Discrete & Computational Geometry, 2005] showed that it is fixed-parameter tractable when parameterized by both $k$ and $d$. We complement this by proving that Hyperplane Cover is W[2]-hard when parameterized by $k$ alone.
Finally, we present an algorithm for Projective Clustering running in $n^{O(dk(r+1))}$ time. This bound matches our lower bound for Line Clustering and generalizes the classic algorithm for $k$-Means Clustering ($r=0$) by Inaba, Katoh, and Imai [SoCG 1994].