Toward Heterogeneous, Distributed, and Energy-Efficient Computing with SYCL

📅 2025-05-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets three problems in heterogeneous distributed HPC systems: the complexity of programming GPU and FPGA accelerators, the high overhead of cross-node data movement, and the difficulty of optimizing for energy efficiency. It proposes a high-level programming framework based on SYCL 2020 with two core contributions: (1) Celerity, a distributed task dispatching mechanism that supports standard SYCL semantics and enables unified scheduling and load balancing across CPUs, GPUs, and FPGAs; and (2) SYnergy, an extension driven by power modeling that combines feedback control with multi-level memory-aware task mapping to achieve energy-aware execution. The framework is fully compatible with mainstream SYCL implementations and requires no modifications to existing SYCL code. Experimental evaluation on heterogeneous clusters demonstrates up to a 2.3× improvement in energy efficiency and a 1.8× speedup in task dispatching throughput.

📝 Abstract

Programming modern high-performance computing systems is challenging due to the need to efficiently program GPUs and accelerators and to handle data movement between nodes. The C++ language has been continuously enhanced in recent years with features that greatly increase productivity. In particular, the C++-based SYCL standard provides a powerful programming model for heterogeneous systems that can target a wide range of devices, including multicore CPUs, GPUs, FPGAs, and accelerators, while providing high-level abstractions. This presentation introduces our research efforts to design a SYCL-based high-level programming interface that provides advanced techniques such as task distribution and energy optimization. The key insight is that SYCL semantics can be easily extended to provide advanced features for easy integration into existing SYCL programs. In particular, we will highlight two SYCL extensions that are designed to deal with workload distribution on accelerator clusters (Celerity) and with energy-efficient computing (SYnergy).
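
The workload distribution mentioned in the abstract can be pictured with a minimal plain C++ sketch. It assumes a one-dimensional kernel index range divided into contiguous, near-equal chunks, one per node. The `Chunk` type and the `split_range` function are hypothetical names introduced here for illustration; they are not part of SYCL or Celerity, and a real runtime would also have to track data dependencies and transfers between nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type, not part of SYCL or Celerity:
// a half-open chunk [begin, end) of a one-dimensional kernel index range.
struct Chunk {
    std::size_t begin;
    std::size_t end;
};

// Split `global_size` work items into `num_nodes` contiguous chunks
// whose sizes differ by at most one item.
std::vector<Chunk> split_range(std::size_t global_size, std::size_t num_nodes) {
    assert(num_nodes > 0 && "need at least one node");
    std::vector<Chunk> chunks;
    chunks.reserve(num_nodes);
    const std::size_t base = global_size / num_nodes;
    const std::size_t extra = global_size % num_nodes;  // first `extra` nodes get one more item
    std::size_t begin = 0;
    for (std::size_t node = 0; node < num_nodes; ++node) {
        const std::size_t size = base + (node < extra ? 1 : 0);
        chunks.push_back({begin, begin + size});
        begin += size;
    }
    return chunks;
}
```

Each node would then run the same kernel over its own chunk only. An even split is the simplest policy; a cluster with mixed device types would likely need chunks weighted by device speed.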

Problem

Research questions and friction points this paper is trying to address.

Efficient programming of GPUs and accelerators in HPC

Handling data movement between distributed computing nodes

Achieving energy-efficient computing in heterogeneous systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

SYCL-based high-level programming interface

Task distribution and energy optimization techniques

Extensions for workload and energy efficiency
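
One common way to reason about energy optimization is to pick, per kernel, the device clock frequency that minimizes the energy-delay product (EDP). The sketch below illustrates that idea in plain C++. It is an assumption made for illustration, not a description of how SYnergy works: the `Sample` type, the function names, and the EDP criterion are all hypothetical and do not come from the source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical profile entry: one kernel measured at one device clock frequency.
struct Sample {
    unsigned freq_mhz;   // clock frequency used for the run
    double runtime_s;    // measured kernel runtime in seconds
    double avg_power_w;  // average power draw in watts
};

// Energy in joules: average power multiplied by runtime.
double energy_j(const Sample& s) { return s.avg_power_w * s.runtime_s; }

// Energy-delay product: penalizes settings that save energy only by running much slower.
double edp(const Sample& s) { return energy_j(s) * s.runtime_s; }

// Return the index of the sample with the lowest EDP (first one wins on ties).
std::size_t pick_min_edp(const std::vector<Sample>& samples) {
    assert(!samples.empty() && "need at least one sample");
    std::size_t best = 0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        if (edp(samples[i]) < edp(samples[best])) {
            best = i;
        }
    }
    return best;
}
```

Minimizing energy alone would usually favor the lowest frequency; EDP is one of several possible metrics that trade energy against runtime.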

🔎 Similar Papers

Lessons Learned Migrating CUDA to SYCL: A HEP Case Study with ROOT RDataFrame