Faster Vertex Cover Algorithms on GPUs with Component-Aware Parallel Branching

📅 2025-12-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing GPU-based approaches for the Minimum/Bounded Vertex Cover problem suffer from redundant computation and load imbalance because they cannot dynamically identify the independent connected components that arise when the graph splits, while high memory overhead severely limits concurrency. Method: This work presents the first load-balanced parallelization of non-tail-recursive branching on GPUs, introducing a component-aware branching mechanism and a descendant-node post-processing strategy to eliminate duplicate subproblem solving. It integrates connected-component detection, graph reduction, induced-subgraph construction, and optimized work-queue management. Results: Experiments demonstrate over 2000× speedup versus the state-of-the-art GPU method; solution time for complex graphs drops from >6 hours to several seconds. Memory consumption is significantly reduced, enabling substantially higher worker concurrency.

📝 Abstract
Algorithms for finding minimum or bounded vertex covers in graphs use a branch-and-reduce strategy, which involves exploring a highly imbalanced search tree. Prior GPU solutions assign different thread blocks to different sub-trees, while using a shared worklist to balance the load. However, these prior solutions do not scale to large and complex graphs because their unawareness of when the graph splits into components causes them to solve these components redundantly. Moreover, their high memory footprint limits the number of workers that can execute concurrently. We propose a novel GPU solution for vertex cover problems that detects when a graph splits into components and branches on the components independently. Although the need to aggregate the solutions of different components introduces non-tail-recursive branches which interfere with load balancing, we overcome this challenge by delegating the post-processing to the last descendant of each branch. We also reduce the memory footprint by reducing the graph and inducing a subgraph before exploring the search tree. Our solution substantially outperforms the state-of-the-art GPU solution, finishing in seconds when the state-of-the-art solution exceeds 6 hours. To the best of our knowledge, our work is the first to parallelize non-tail-recursive branching patterns on GPUs in a load-balanced manner.
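The component-aware idea in the abstract can be illustrated with a minimal sequential sketch. This is not the paper's GPU implementation (the contribution there is the load-balanced parallelization of the non-tail-recursive branches); it only shows why detecting components helps: each component is solved once and the cover sizes are summed, instead of the components being re-explored redundantly inside a single search tree. The function names and the max-degree branching rule below are illustrative assumptions, not taken from the paper.

```python
# Sequential sketch of component-aware branch-and-reduce for minimum
# vertex cover. Graphs are dicts mapping a vertex to its neighbor set.

def components(adj):
    """Yield the vertex sets of the connected components of adj."""
    seen = set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v])
        seen |= comp
        yield comp

def min_vertex_cover_size(adj):
    # Isolated vertices never belong to a minimum cover; drop them.
    adj = {v: set(ns) for v, ns in adj.items() if ns}
    if not adj:
        return 0
    comps = list(components(adj))
    if len(comps) > 1:
        # Component-aware step: solve each component independently and
        # aggregate by summing the per-component cover sizes.
        return sum(min_vertex_cover_size({v: adj[v] for v in c})
                   for c in comps)
    # Branch on a maximum-degree vertex v: either v is in the cover,
    # or every neighbor of v is.
    v = max(adj, key=lambda u: len(adj[u]))
    def without(removed):
        return {u: ns - removed for u, ns in adj.items() if u not in removed}
    take_v = 1 + min_vertex_cover_size(without({v}))
    take_nbrs = len(adj[v]) + min_vertex_cover_size(without(set(adj[v])))
    return min(take_v, take_nbrs)
```

Summing per-component results is exactly the aggregation that introduces the non-tail-recursive branches the abstract mentions: a parent subproblem cannot finish until all of its component children have reported back.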
Problem

Research questions and friction points this paper is trying to address.

Prior GPU branch-and-reduce solvers for vertex cover do not scale to large, complex graphs.
Unaware of when the graph splits, they solve the resulting connected components redundantly.
Their high per-worker memory footprint limits how many GPU workers can run concurrently.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Detects graph splits into components for independent branching
Delegates post-processing to last descendant to balance load
Reduces memory by graph reduction and subgraph induction before exploration
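The last innovation bullet, reducing the graph and inducing a compact subgraph before search, can be sketched with one classic reduction rule. Assumptions to flag: the degree-1 rule below (the neighbor of a degree-1 vertex can always be placed in a minimum cover) is a standard vertex cover reduction, but the paper's exact rule set and GPU data layout are not shown here; the function names are hypothetical.

```python
# Sketch of pre-exploration shrinking: (1) apply the degree-1 reduction
# rule exhaustively, (2) relabel surviving vertices to a dense 0..n-1
# range so later per-worker state is proportional to the reduced graph.

def reduce_degree_one(adj):
    """Return (forced_cover, residual_adj) after the degree-1 rule."""
    adj = {v: set(ns) for v, ns in adj.items()}
    cover = set()
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(adj[v]) == 1:
                (u,) = adj[v]
                cover.add(u)  # u is in some minimum cover; remove it
                for w in adj[u]:
                    adj[w].discard(u)
                del adj[u]
                changed = True
                break
    # Drop vertices left with no edges.
    return cover, {v: ns for v, ns in adj.items() if ns}

def induce_compact(adj):
    """Relabel vertices to 0..n-1 and build the induced subgraph."""
    relabel = {v: i for i, v in enumerate(sorted(adj))}
    induced = {relabel[v]: {relabel[u] for u in ns}
               for v, ns in adj.items()}
    return induced, relabel
```

Compact relabeling is what lowers the memory footprint: adjacency and per-branch bookkeeping are sized to the reduced vertex count, which is what lets more workers fit in GPU memory and run concurrently.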
Hussein Amro
Department of Computer Science at the American University of Beirut
Basel Fakhri
Department of Computer Science at the American University of Beirut
Amer E. Mouawad
Department of Computer Science at the American University of Beirut and David R. Cheriton School of Computer Science at the University of Waterloo
Izzat El Hajj
American University of Beirut
Parallel Computing · GPU Computing · Memory Systems · Compilers