GPU-Parallelizable Randomized Sketch-and-Precondition for Linear Regression using Sparse Sign Sketches

📅 2025-06-03

🤖 AI Summary

This work addresses large-scale linear regression with a GPU-accelerated sketch-and-precondition framework built on sparse sign sketches. To remove the bottlenecks of generating and applying the sketch in heterogeneous parallel environments, the authors design a lightweight sparse sign sketching algorithm based on rejection sampling that parallelizes efficiently on GPUs. They further conduct the first systematic evaluation of this paradigm's scalability and practicality on single- and multi-GPU platforms. The method combines sparse random projections, sign-based sketching, GPU acceleration, and preconditioned conjugate gradient solvers, achieving numerical robustness while substantially reducing communication overhead. Experimental results show better performance than highly optimized CPU-based approaches and support the method's suitability for integration into black-box least-squares solvers.

📝 Abstract

A litany of theoretical and numerical results have established the sketch-and-precondition paradigm as a powerful approach to solving large linear regression problems in standard computing environments. Perhaps surprisingly, much less work has been done on understanding how sketch-and-precondition performs on graphics processing unit (GPU) systems. We address this gap by benchmarking an implementation of sketch-and-precondition based on sparse sign sketches on single- and multi-GPU systems. In doing so, we describe a novel, easily parallelized, rejection-sampling-based method for generating sparse sign sketches. Our approach, which is particularly well-suited for GPUs, is easily adapted to a variety of computing environments. Taken as a whole, our numerical experiments indicate that sketch-and-precondition with sparse sign sketches is particularly well-suited for GPUs, and may be suitable for use in black-box least-squares solvers.
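As a rough illustration of the sketch-and-precondition idea the abstract refers to, here is a minimal CPU sketch in NumPy. It is not the authors' implementation: the function names `apply_sparse_sign_sketch` and `sketch_and_precondition`, the parameter defaults, and the choice of CGLS (conjugate gradient on the normal equations) as the iterative solver are all assumptions made for this example. A sparse sign sketch S is applied to A, the R factor of a QR factorization of SA serves as a right preconditioner, and the iteration then runs on the well-conditioned matrix A R^{-1}.

```python
import numpy as np


def apply_sparse_sign_sketch(A, d, zeta, rng):
    """Return S @ A, where S is a d x n sparse sign sketch (n = A.shape[0]).

    Each column of S holds zeta nonzeros equal to +-1/sqrt(zeta), placed in
    distinct rows chosen uniformly at random. S is never formed explicitly.
    """
    n = A.shape[0]
    # One set of zeta distinct target rows per column of S (one per row of A).
    rows = np.stack([rng.choice(d, size=zeta, replace=False) for _ in range(n)])
    vals = rng.choice([-1.0, 1.0], size=(n, zeta)) / np.sqrt(zeta)
    SA = np.zeros((d, A.shape[1]))
    for k in range(zeta):
        # Scatter-add: row i of A, scaled by vals[i, k], lands in row rows[i, k].
        np.add.at(SA, rows[:, k], vals[:, k, None] * A)
    return SA


def sketch_and_precondition(A, b, d, zeta=8, tol=1e-10, max_iter=100, seed=0):
    """Solve min_x ||A x - b||_2 with a sketched QR preconditioner plus CGLS."""
    rng = np.random.default_rng(seed)
    R = np.linalg.qr(apply_sparse_sign_sketch(A, d, zeta, rng), mode="r")

    # B = A R^{-1} is well conditioned with high probability. np.linalg.solve
    # is a general solver; a dedicated triangular solve would be preferable.
    def B(y):
        return A @ np.linalg.solve(R, y)

    def Bt(r):
        return np.linalg.solve(R.T, A.T @ r)

    # CGLS for min_y ||B y - b||_2, starting from y = 0.
    y = np.zeros(A.shape[1])
    r = b.astype(float)
    s = Bt(r)
    p = s.copy()
    gamma = gamma0 = s @ s
    for _ in range(max_iter):
        if gamma <= (tol ** 2) * gamma0:
            break
        q = B(p)
        alpha = gamma / (q @ q)
        y += alpha * p
        r -= alpha * q
        s = Bt(r)
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return np.linalg.solve(R, y), R  # map back: x = R^{-1} y


# Synthetic ill-conditioned test problem (all sizes are illustrative).
rng = np.random.default_rng(1)
n, p = 2000, 20
U, _ = np.linalg.qr(rng.standard_normal((n, p)))
V, _ = np.linalg.qr(rng.standard_normal((p, p)))
A = (U * np.logspace(0, -4, p)) @ V.T  # singular values from 1 down to 1e-4
b = A @ rng.standard_normal(p) + 1e-3 * rng.standard_normal(n)
x, R = sketch_and_precondition(A, b, d=8 * p)
print("cond(A):", np.linalg.cond(A))
print("cond(A R^-1):", np.linalg.cond(A @ np.linalg.inv(R)))
```

The two printed condition numbers show the point of the preconditioner: the iteration count depends on the conditioning of A R^{-1}, not of A. On a GPU the scatter-add and the matrix products would be the natural candidates for parallel kernels, but that mapping is not shown here.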

Problem

Research questions and friction points this paper is trying to address.

Evaluating sketch-and-precondition performance on GPU systems

Developing a parallelizable method for generating sparse sign sketches

Assessing suitability for black-box least-squares solvers

Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-parallelizable sparse sign sketches

Rejection sampling for sketch generation

Adaptable to various computing environments
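One plausible way to generate a sparse sign sketch by rejection sampling is sketched below in NumPy. The page does not spell out the paper's algorithm, so this is an assumption about its general shape, and the function names `sample_distinct_rows` and `sparse_sign_sketch` are hypothetical. Every column draws its zeta row indices independently with replacement, and only the columns that contain a repeated index are redrawn. Because columns never interact, each one could be handled by its own GPU thread.

```python
import numpy as np


def sample_distinct_rows(d, n, zeta, rng):
    """Return a (zeta, n) integer array whose columns each hold zeta distinct
    row indices in [0, d), obtained by redrawing columns with repeats."""
    if zeta > d:
        raise ValueError("zeta cannot exceed the sketch dimension d")
    rows = rng.integers(0, d, size=(zeta, n))
    while True:
        ordered = np.sort(rows, axis=0)
        # A column is rejected if two of its sorted entries coincide.
        bad = (ordered[1:] == ordered[:-1]).any(axis=0)
        if not bad.any():
            return rows
        # Redraw only the rejected columns; accepted ones are left untouched.
        rows[:, bad] = rng.integers(0, d, size=(zeta, int(bad.sum())))


def sparse_sign_sketch(d, n, zeta, seed=0):
    """Return COO-style triplets (rows, cols, vals) of a d x n sparse sign
    sketch with zeta nonzeros of magnitude 1/sqrt(zeta) in every column."""
    rng = np.random.default_rng(seed)
    rows = sample_distinct_rows(d, n, zeta, rng)
    cols = np.broadcast_to(np.arange(n), (zeta, n))
    vals = rng.choice([-1.0, 1.0], size=(zeta, n)) / np.sqrt(zeta)
    return rows, cols, vals


# Illustrative sizes only.
rows, cols, vals = sparse_sign_sketch(d=64, n=1000, zeta=8, seed=3)
S = np.zeros((64, 1000))
S[rows, cols] = vals  # dense copy, built here only to inspect the result
print("nonzeros per column:", np.unique(np.count_nonzero(S, axis=0)))
```

By the usual birthday-problem estimate, a column is accepted on a given draw with probability about exp(-zeta(zeta-1)/(2d)), so when zeta is small relative to the square root of d very few columns need a second draw.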
