🤖 AI Summary
CountSketch lacks high-performance GPU implementations and remains underutilized in multisketching and least-squares solving. Method: This paper develops a GPU-optimized CountSketch algorithm with parallel memory-access optimizations, and combines it with a Gaussian sketch (multisketching) to accelerate randomized dimensionality reduction. Building on this implementation, it presents a multisketched least-squares solver. Contributions/Results: Experiments demonstrate that the solver is up to 77% faster than the normal equations on standard least-squares problems while offering significantly better numerical stability, at the cost of an $O(1)$ multiplicative factor in the relative residual norm. The framework establishes an efficient and robust GPU-accelerated approach to large-scale sketched linear algebra.
📝 Abstract
Random sketching is a dimensionality reduction technique that approximately preserves norms and singular values up to some $O(1)$ distortion factor with high probability. The most popular sketches in the literature are the Gaussian sketch and the subsampled randomized Hadamard transform, while the CountSketch has lower complexity. Composing two sketches, known as multisketching, offers an inexpensive way to quickly reduce the dimension of a matrix, for instance by applying a CountSketch followed by a Gaussian sketch.
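The multisketching idea can be illustrated with a minimal NumPy sketch (this is only an illustrative CPU prototype, not the paper's GPU implementation; the function name `countsketch` and all dimensions are chosen here for exposition). A CountSketch hashes each row of $A$ to one of $k_1$ buckets with a random sign, costing $O(\mathrm{nnz}(A))$, and a dense Gaussian sketch then reduces the result further to $k_2$ rows:

```python
import numpy as np

def countsketch(A, k, rng):
    """Apply a CountSketch S (k x n) to A (n x d) in O(nnz(A)) time.
    Each row of A is hashed to one of k buckets with a random +/-1 sign."""
    n, d = A.shape
    rows = rng.integers(0, k, size=n)          # hash h: [n] -> [k]
    signs = rng.choice([-1.0, 1.0], size=n)    # random sign flips
    SA = np.zeros((k, d))
    np.add.at(SA, rows, signs[:, None] * A)    # unbuffered scatter-add
    return SA

rng = np.random.default_rng(0)
A = rng.standard_normal((10_000, 50))

# Multisketch: cheap CountSketch to k1 rows, then a Gaussian sketch to k2 rows.
k1, k2 = 500, 100
SA = countsketch(A, k1, rng)
G = rng.standard_normal((k2, k1)) / np.sqrt(k2)
GSA = G @ SA   # final (k2 x 50) sketch of A

# Norms are preserved up to O(1) distortion with high probability.
print(np.linalg.norm(A), np.linalg.norm(GSA))
```

The Gaussian matrix is only $k_2 \times k_1$, so the expensive dense multiply happens on the already-reduced matrix — this is what makes the combination inexpensive.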
However, there has been little investigation into high-performance CountSketch implementations. In this work, we develop an efficient GPU implementation of the CountSketch and demonstrate the performance of multisketching using this technique. We also demonstrate the potential of this implementation within a multisketched least-squares solver that is up to $77\%$ faster than the normal equations, with significantly better numerical stability, at the cost of an $O(1)$ multiplicative factor introduced into the relative residual norm.
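The trade-off described above — an $O(1)$ factor in the residual in exchange for solving a much smaller problem — can be seen in a basic sketch-and-solve least-squares example (a minimal NumPy illustration of the general technique, assuming a single CountSketch and illustrative sizes; the paper's multisketched GPU solver is more elaborate):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 20_000, 100, 800
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n) + 0.01 * rng.standard_normal(m)

# Sketch A and b with one CountSketch: hash each of the m rows
# into one of k buckets with a random sign.
rows = rng.integers(0, k, size=m)
signs = rng.choice([-1.0, 1.0], size=m)
SA = np.zeros((k, n))
Sb = np.zeros(k)
np.add.at(SA, rows, signs[:, None] * A)
np.add.at(Sb, rows, signs * b)

# Solve the k x n sketched problem instead of the m x n original.
x_sk = np.linalg.lstsq(SA, Sb, rcond=None)[0]
x_ex = np.linalg.lstsq(A, b, rcond=None)[0]

# The sketched solution's residual is within an O(1) factor of optimal.
r_sk = np.linalg.norm(A @ x_sk - b)
r_ex = np.linalg.norm(A @ x_ex - b)
print(r_sk / r_ex)
```

Unlike the normal equations $A^\top A x = A^\top b$, which square the condition number of $A$, the sketched problem can be solved by a QR-based method on a small matrix, which is one source of the improved numerical stability.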