Adaptive Sketching Based Construction of H2 Matrices on GPUs

📅 2025-06-20
🤖 AI Summary
Problem: Existing H²-matrix construction on GPUs is inefficient and lacks support for batching variable-size data. Method: The authors propose a linear-complexity bottom-up sketching algorithm and a GPU framework that performs sketching and entry evaluation in batched mode over variable-size data structures. The approach takes a black-box sketching operator and a user-supplied entry evaluation function as inputs, and its batched algorithms minimize the number of kernel launches to maximize GPU throughput. Contribution/Results: Experiments show speedups of up to 13× over the authors' CPU implementation, up to 1000× over H2Opus's top-down GPU approach, and 660× over ButterflyPACK's sketching-based H-matrix construction. This is the first GPU implementation of the class of bottom-up sketching-based H²-matrix construction algorithms.

📝 Abstract
We develop a novel linear-complexity, bottom-up, sketching-based algorithm for constructing an $H^2$ matrix and present its high-performance GPU implementation. The construction algorithm requires both a black-box sketching operator and an entry evaluation function. The novelty of our GPU approach lies in the design and implementation of these two operations in batched mode on the GPU, accommodating variable-size data structures within a batch. The batched algorithms minimize the number of kernel launches and maximize GPU throughput. When applied to covariance matrices, volume integral equation (IE) matrices, and $H^2$ update operations, our GPU implementation achieves up to $13\times$ speedup over our CPU implementation and up to $1000\times$ speedup over an existing GPU implementation of the top-down sketching-based algorithm from the H2Opus library. It also achieves a $660\times$ speedup over an existing sketching-based $H$-matrix construction algorithm from the ButterflyPACK library. Our work represents the first GPU implementation of the class of bottom-up sketching-based $H^2$ construction algorithms.
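As a rough illustration of the two inputs the abstract names, the sketch below builds a sketching operator (returning $Y = A\Omega$ for a random $\Omega$) and an entry evaluation function (returning a requested sub-block of $A$) for a small covariance matrix. The exponential kernel, the 1D point set, and the names `make_operators`, `sketch`, and `evaluate_entries` are assumptions made for this example, not the paper's interface. The dense product used here costs $O(n^2)$ and is only a stand-in for a fast black-box matrix-vector routine.

```python
import math
import random


def make_operators(points, length_scale=0.5):
    """Build the two black-box inputs for a hypothetical covariance matrix
    A[i][j] = exp(-|x_i - x_j| / length_scale)."""
    n = len(points)

    def entry(i, j):
        return math.exp(-abs(points[i] - points[j]) / length_scale)

    def evaluate_entries(rows, cols):
        # Entry evaluation: return the sub-block A[rows, cols] as nested lists.
        return [[entry(i, j) for j in cols] for i in rows]

    def sketch(omega):
        # Sketching operator: return Y = A @ omega for an n x d matrix omega.
        d = len(omega[0])
        return [
            [sum(entry(i, k) * omega[k][c] for k in range(n)) for c in range(d)]
            for i in range(n)
        ]

    return sketch, evaluate_entries


random.seed(0)
points = [i / 15 for i in range(16)]          # 16 points on [0, 1]
sketch, evaluate_entries = make_operators(points)

# Gaussian random test matrix with 4 columns (the sketch width is arbitrary here).
omega = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(16)]
Y = sketch(omega)                              # 16 x 4 sketch of A
block = evaluate_entries([0, 1], [14, 15])     # 2 x 2 sub-block of A
```

A construction algorithm of this kind only ever touches the matrix through these two calls, which is what lets the matrix itself stay implicit.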
Problem

Research questions and friction points this paper is trying to address.

Develops linear-complexity algorithm for H2 matrix construction
Optimizes GPU implementation with batched operations
Achieves up to 13× speedup over a CPU implementation and up to 1000× over an existing GPU method (H2Opus)
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear-complexity bottom-up sketching-based algorithm
Batched GPU operations for variable-size data
High-performance GPU implementation, up to 13× faster than the authors' CPU implementation
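One common way to batch blocks of differing sizes is to pack them into a single flat buffer with per-block offsets and shapes, so that one pass (one kernel launch on a GPU) can process the whole batch. The sketch below shows that layout on the CPU in plain Python. It illustrates the general idea only and is not the paper's kernels: `pack_batch` and `batched_row_sums` are hypothetical names, and row sums stand in for whatever per-block operation is being batched.

```python
def pack_batch(blocks):
    """Flatten row-major blocks of differing shapes into one buffer plus metadata."""
    data, offsets, shapes = [], [0], []
    for block in blocks:
        rows, cols = len(block), len(block[0])
        shapes.append((rows, cols))
        for row in block:
            data.extend(row)
        offsets.append(len(data))   # start of the next block in the flat buffer
    return data, offsets, shapes


def batched_row_sums(data, offsets, shapes):
    """Process every block in one pass over the packed buffer.

    Each iteration of the outer loop stands in for the work that one GPU
    thread block would do on its own, independently sized, piece of the batch.
    """
    results = []
    for b, (rows, cols) in enumerate(shapes):
        base = offsets[b]
        results.append(
            [sum(data[base + r * cols: base + (r + 1) * cols]) for r in range(rows)]
        )
    return results


# Three blocks with shapes 2x2, 1x3, and 3x1 in the same batch.
blocks = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0, 7.0]],
    [[8.0], [9.0], [10.0]],
]
data, offsets, shapes = pack_batch(blocks)
sums = batched_row_sums(data, offsets, shapes)
```

Because the offsets and shapes travel with the buffer, no block needs to be padded to a common size.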