GPU-Accelerated Distributed QAOA on Large-scale HPC Ecosystems

📅 2025-06-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the scalability limits of high-dimensional, dense combinatorial optimization problems, this paper improves DQAOA, a distributed implementation of the Quantum Approximate Optimization Algorithm (QAOA), and deploys it on the Frontier exascale supercomputer. The approach combines GPU-accelerated quantum simulation with MPI-based message passing across nodes, an enhanced problem decomposition strategy, and quantum-classical workload management that distributes large problem instances across classical and quantum resources. Experiments on Frontier show up to 10× speedup over CPU-only simulation and enable solving problems with up to one thousand variables. Integration with the Quantum Framework (QFw) is ongoing and is intended to support future HPC-quantum computing systems. To the authors' knowledge, this is the first large-scale deployment of DQAOA on an exascale platform, a step toward the practical integration of hybrid quantum-classical algorithms into the HPC ecosystem.

📝 Abstract

Quantum computing holds great potential to accelerate the process of solving complex combinatorial optimization problems. The Distributed Quantum Approximate Optimization Algorithm (DQAOA) addresses high-dimensional, dense problems using current quantum computing techniques and high-performance computing (HPC) systems. In this work, we improve the scalability and efficiency of DQAOA through advanced problem decomposition and parallel execution using message passing on the Frontier CPU/GPU supercomputer. Our approach ensures efficient quantum-classical workload management by distributing large problem instances across classical and quantum resources. Experimental results demonstrate that enhanced decomposition strategies and GPU-accelerated quantum simulations significantly improve DQAOA's performance, achieving up to 10x speedup over CPU-based simulations. This advancement enables better scalability for large problem instances, supporting the practical deployment of GPU systems for hybrid quantum-classical applications. We also highlight ongoing integration efforts using the Quantum Framework (QFw) to support future HPC-quantum computing systems.
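The decomposition idea in the abstract can be illustrated with a minimal sketch. It assumes the problem is a QUBO (minimize x^T Q x over binary x), which the abstract does not state, and it replaces the QAOA solve on each sub-problem with a brute-force search. All names (qubo_energy, solve_subproblem, decompose_and_solve) and parameters (sub_size, n_sub, rounds) are hypothetical and are not taken from the paper.

```python
import itertools
import random


def qubo_energy(Q, x):
    """Return x^T Q x for a binary vector x and a dense square matrix Q."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def solve_subproblem(Q, x, idx):
    """Optimize only the variables in idx, holding the rest of x fixed.

    Brute force stands in for a QAOA run on the sub-problem (an assumption).
    """
    best_x, best_e = list(x), qubo_energy(Q, x)
    for bits in itertools.product((0, 1), repeat=len(idx)):
        cand = list(x)
        for i, b in zip(idx, bits):
            cand[i] = b
        e = qubo_energy(Q, cand)
        if e < best_e:
            best_x, best_e = cand, e
    return best_x, best_e


def decompose_and_solve(Q, sub_size=4, n_sub=8, rounds=5, seed=0):
    """Iteratively sample small variable subsets, solve each, keep the best."""
    rng = random.Random(seed)
    n = len(Q)
    x = [rng.randint(0, 1) for _ in range(n)]
    e = qubo_energy(Q, x)
    for _ in range(rounds):
        # Given the current x, the sub-problems are independent of each other,
        # so a distributed run could hand each one to a different worker.
        subsets = [rng.sample(range(n), sub_size) for _ in range(n_sub)]
        results = [solve_subproblem(Q, x, idx) for idx in subsets]
        cand_x, cand_e = min(results, key=lambda r: r[1])
        if cand_e < e:
            x, e = cand_x, cand_e
    return x, e
```

Each round only ever accepts a candidate with lower energy, so the returned energy is never worse than that of the random starting point. The cost of one sub-problem grows as 2^sub_size here, which is the part a quantum solver or GPU-accelerated simulator would take over.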

Problem

Research questions and friction points this paper is trying to address.

Enhancing scalability of Distributed QAOA on HPC systems

Optimizing quantum-classical workload management for large problems

Accelerating quantum simulations using GPU-based parallel execution

Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-accelerated quantum simulations for speedup

Distributed problem decomposition on HPC systems

Hybrid quantum-classical workload management
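As a rough illustration of distributing independent sub-problems across parallel workers, the sketch below assigns task indices to ranks in round-robin order and then merges the per-rank results back into task order. It is a serial simulation under stated assumptions: the paper's actual scheduling scheme is not described here, and the names tasks_for_rank and simulate_gather are hypothetical. In a real MPI program the rank and size values would come from the communicator (for example `Get_rank()` and `Get_size()` in mpi4py).

```python
def tasks_for_rank(n_tasks, rank, size):
    """Round-robin assignment: rank r handles tasks r, r + size, r + 2*size, ..."""
    return list(range(rank, n_tasks, size))


def simulate_gather(n_tasks, size, work):
    """Run every rank's share serially and merge results in task order.

    work is any function mapping a task index to a result; here it stands in
    for solving one sub-problem on one worker.
    """
    per_rank = [
        [(t, work(t)) for t in tasks_for_rank(n_tasks, r, size)]
        for r in range(size)
    ]
    merged = dict(pair for chunk in per_rank for pair in chunk)
    return [merged[t] for t in range(n_tasks)]
```

Round-robin assignment keeps the number of tasks per rank within one of each other, which is adequate when sub-problems have similar cost; uneven costs would call for a dynamic scheme instead.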
