Adaptive Multidimensional Quadrature on Multi-GPU Systems

📅 2025-11-03

🤖 AI Summary
To address load imbalance and poor convergence robustness in high-dimensional adaptive numerical integration on multi-GPU systems, this paper proposes a decentralized distributed algorithm. The method employs hierarchical domain decomposition and local error-driven recursive subdivision, enabling independent adaptive partitioning on each GPU. A cyclic polling-based dynamic load redistribution mechanism uses non-blocking CUDA-aware MPI for low-overhead inter-GPU communication, and requires neither global synchronization nor centralized scheduling. Experiments on typical 10–50 dimensional integral problems demonstrate that the proposed approach achieves 1.8–3.2× higher computational efficiency than state-of-the-art GPU-accelerated integration libraries (e.g., Cuba-GPU, GpuQUAD). It is also markedly more robust when integrand regularity degrades or the target accuracy changes.

📝 Abstract
We introduce a distributed adaptive quadrature method that formulates multidimensional integration as a hierarchical domain decomposition problem on multi-GPU architectures. The integration domain is recursively partitioned into subdomains whose refinement is guided by local error estimators. Each subdomain evolves independently on a GPU, which exposes a significant load imbalance as the adaptive process progresses. To address this challenge, we introduce a decentralised load redistribution scheme based on a cyclic round-robin policy. This strategy dynamically rebalances subdomains across devices through non-blocking, CUDA-aware MPI communication that overlaps with computation. The proposed strategy has two main advantages over a state-of-the-art GPU-tailored package: higher efficiency in high dimensions, and improved robustness with respect to the integrand regularity and the target accuracy.
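
The error-driven recursive partitioning described in the abstract can be illustrated with a small single-process sketch. This is not the paper's implementation: the midpoint rule, the coarse-versus-refined error estimator, the split into 2^d children, and the names `midpoint_estimate`, `split_box`, `evaluate`, and `adaptive_integrate` are all assumptions chosen for brevity. Splitting every axis is only practical in low dimensions; it is used here to keep the example short.

```python
import heapq
import itertools
import math

def midpoint_estimate(f, lo, hi):
    """Midpoint-rule estimate of the integral of f over the box [lo, hi]."""
    center = [(a + b) / 2 for a, b in zip(lo, hi)]
    volume = math.prod(b - a for a, b in zip(lo, hi))
    return f(center) * volume

def split_box(lo, hi):
    """Split a box into 2^d children by halving every axis (illustrative only)."""
    mid = [(a + b) / 2 for a, b in zip(lo, hi)]
    children = []
    for choice in itertools.product((0, 1), repeat=len(lo)):
        c_lo = [mid[i] if bit else lo[i] for i, bit in enumerate(choice)]
        c_hi = [hi[i] if bit else mid[i] for i, bit in enumerate(choice)]
        children.append((c_lo, c_hi))
    return children

def evaluate(f, lo, hi):
    """Return (estimate, local error): refined estimate and its gap to the coarse one."""
    coarse = midpoint_estimate(f, lo, hi)
    fine = sum(midpoint_estimate(f, c_lo, c_hi) for c_lo, c_hi in split_box(lo, hi))
    return fine, abs(fine - coarse)

def adaptive_integrate(f, lo, hi, tol=1e-4, max_boxes=20000):
    """Refine the box with the largest local error until the summed error meets tol."""
    est, err = evaluate(f, lo, hi)
    tie = itertools.count()  # tie-breaker so the heap never compares box lists
    heap = [(-err, next(tie), est, lo, hi)]  # max-heap on error via negation
    total_est, total_err = est, err
    while total_err > tol and len(heap) < max_boxes:
        neg_err, _, est, b_lo, b_hi = heapq.heappop(heap)
        total_est -= est
        total_err += neg_err  # neg_err is -err, so this removes the parent's error
        for c_lo, c_hi in split_box(b_lo, b_hi):
            c_est, c_err = evaluate(f, c_lo, c_hi)
            heapq.heappush(heap, (-c_err, next(tie), c_est, c_lo, c_hi))
            total_est += c_est
            total_err += c_err
    return total_est, total_err, len(heap)
```

Calling `adaptive_integrate(lambda x: x[0]**2 + x[1]**2, [0.0, 0.0], [1.0, 1.0])` returns the integral estimate, the accumulated error estimate, and the number of leaf subdomains. In a multi-GPU setting each device would hold its own queue of subdomains, which is where the load imbalance mentioned above comes from: devices whose regions contain hard features accumulate far more boxes than the others.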
Problem

Research questions and friction points this paper is trying to address.

Develops adaptive quadrature for multidimensional integration on multi-GPU systems
Addresses load imbalance via decentralized redistribution during domain decomposition
Improves efficiency and robustness in high-dimensional integration problems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical domain decomposition for multidimensional integration
Decentralized load balancing using cyclic round-robin policy
Non-blocking CUDA-aware MPI communication overlapping computation
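
One way to picture a decentralised cyclic round-robin policy is the following single-process simulation. The source does not spell out the exact rule, so everything here is an assumption: in round `r` each device `i` looks only at one partner, `(i + shift) % P`, with the shift cycling through `1..P-1`, and hands over half of its surplus of subdomains. The function name `round_robin_rebalance` is hypothetical. In a real multi-GPU code the transfers would be non-blocking MPI sends and receives (such as `MPI_Isend` and `MPI_Irecv`) on device buffers, which the simulation replaces with plain integer updates.

```python
def round_robin_rebalance(loads, rounds):
    """Simulate pairwise load balancing with a cyclically shifting partner.

    loads  -- number of subdomains currently held by each device
    rounds -- number of redistribution rounds to run
    """
    loads = list(loads)
    p = len(loads)
    if p < 2:
        return loads  # nothing to balance with a single device
    for r in range(rounds):
        shift = 1 + r % (p - 1)      # partner offset cycles 1, 2, ..., p-1
        snapshot = list(loads)       # loads each device sees at the start of the round
        for i in range(p):
            j = (i + shift) % p      # the only peer device i talks to this round
            surplus = snapshot[i] - snapshot[j]
            if surplus > 1:
                moved = surplus // 2  # send half of the difference to the partner
                loads[i] -= moved
                loads[j] += moved
    return loads
```

For example, `round_robin_rebalance([100, 0, 0, 0], rounds=2)` spreads the work of one overloaded device across all four. No device needs a global view of the loads or a central scheduler; each decision uses only the counts of two devices, which is the property the decentralised scheme relies on.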