🤖 AI Summary
Spatial branch-and-bound (B&B) algorithms for nonconvex global optimization are computationally expensive, and GPU acceleration of them remains underexplored. Method: This paper proposes a deterministic spatial B&B framework with a GPU-parallel, interval-based lower bounding solver. The domain of each B&B node is temporarily partitioned into many subdomains, and interval bounds of the objective and constraints are computed on every subdomain in parallel using the mean value form, which tightens the lower bound. The method is implemented in the open-source solver MAiNGO via CUDA in two ways: as a single kernel that wraps all GPU tasks, and as a CUDA graph across which the tasks are distributed. Contribution/Results: Experiments show that more subdomains give significantly tighter lower bounds and fewer B&B iterations, and that the framework achieves a wall-clock speedup of three orders of magnitude over CPU-based interval arithmetic without domain partitioning. The CUDA graph implementation is the more efficient of the two. In some case studies, the method is competitive with or better than MAiNGO's default solver based on McCormick relaxations.
📝 Abstract
Spatial Branch and Bound (B&B) algorithms are widely used for solving nonconvex problems to global optimality, yet they remain computationally expensive. Although some work has been done to speed up B&B via CPU parallelization, GPU parallelization is much less explored. In this work, we investigate the design of a spatial B&B algorithm that involves an interval-based GPU-parallel lower bounding solver: the domain of each B&B node is temporarily partitioned into numerous subdomains, then massive GPU parallelism is leveraged to compute interval bounds of the objective function and constraints on each subdomain, using the Mean Value Form. The resulting bounds are tighter than those achieved via regular interval arithmetic without partitioning, but they remain fast to compute. We implement the method in our open-source solver MAiNGO via CUDA in two ways: wrapping all GPU tasks within one kernel function, or distributing the GPU tasks onto a CUDA graph. Numerical experiments show that using more subdomains leads to significantly tighter lower bounds and thus fewer B&B iterations. Regarding wall clock time, the proposed spatial B&B framework achieves a speedup of three orders of magnitude compared to applying interval arithmetic on the CPU without domain partitioning. Of the two implementations, the one developed with CUDA graph is more efficient. Moreover, in some case studies, the proposed method delivers competitive or better performance compared to MAiNGO's default solver, which is based on McCormick relaxations. These results highlight the potential of GPU-accelerated bounding techniques to accelerate B&B algorithms.
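To make the bounding idea concrete, the following is a minimal CPU-only sketch in plain Python, not the MAiNGO or CUDA implementation. It assumes a hypothetical one-dimensional objective f(x) = x^4 - 3x^2 + x, a uniform partition of the node domain, and ordinary floating-point arithmetic without the outward rounding that a rigorous interval library would use. On each subdomain X with midpoint c, the Mean Value Form encloses f(X) by f(c) + f'(X)(X - c); the node's lower bound is the minimum over all subdomains. The loop over subdomains stands in for the work that would be spread across GPU threads.

```python
def interval_mul(a, b):
    """Product of two intervals given as (lo, hi) tuples."""
    products = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(products), max(products))


def f(x):
    """Hypothetical nonconvex objective, chosen only for illustration."""
    return x**4 - 3.0 * x**2 + x


def df_interval(box):
    """Interval enclosure of f'(x) = 4x^3 - 6x + 1 on box = (lo, hi)."""
    lo, hi = box
    # x^3 is monotonically increasing, so its range on [lo, hi] is exact.
    cube = (lo**3, hi**3)
    # Natural interval extension: -6x attains its minimum at hi, maximum at lo.
    return (4.0 * cube[0] - 6.0 * hi + 1.0, 4.0 * cube[1] - 6.0 * lo + 1.0)


def mean_value_lower_bound(box):
    """Lower end of the mean value form f(c) + f'(X) * (X - c)."""
    lo, hi = box
    c = 0.5 * (lo + hi)
    slope = df_interval(box)
    delta = (lo - c, hi - c)
    return f(c) + interval_mul(slope, delta)[0]


def partitioned_lower_bound(box, n):
    """Split box into n equal subdomains and take the smallest bound.

    Each subdomain is independent, so on a GPU every iteration of this
    loop could be assigned to its own thread.
    """
    lo, hi = box
    width = (hi - lo) / n
    return min(
        mean_value_lower_bound((lo + i * width, lo + (i + 1) * width))
        for i in range(n)
    )


if __name__ == "__main__":
    domain = (-2.0, 2.0)
    for n in (1, 4, 16, 64):
        print(n, partitioned_lower_bound(domain, n))
```

With a single subdomain the bound on [-2, 2] is -90, far below the true minimum of roughly -3.5. Increasing the number of subdomains moves the bound toward that minimum, which is the effect the abstract credits for the reduction in B&B iterations.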