Toward Heterogeneous, Distributed, and Energy-Efficient Computing with SYCL

📅 2025-05-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of programming complexity for GPU/FPGA accelerators, high cross-node data movement overhead, and difficulty in energy-efficiency optimization in heterogeneous distributed HPC systems, this paper proposes a high-level programming framework based on SYCL 2020. Its core contributions are: (1) Celerity—a novel distributed task dispatching mechanism supporting standard SYCL semantics, enabling unified scheduling and load balancing across CPUs, GPUs, and FPGAs; and (2) SYnergy—a power-modeling-driven co-optimization extension that integrates feedback control with multi-level memory-aware task mapping to achieve energy-aware execution. The framework is fully compatible with mainstream SYCL implementations and requires no modifications to existing SYCL code. Experimental evaluation on heterogeneous clusters demonstrates up to a 2.3× improvement in energy efficiency and a 1.8× speedup in task dispatching throughput.

📝 Abstract
Programming modern high-performance computing systems is challenging because it requires programming GPUs and other accelerators efficiently while also handling data movement between nodes. The C++ language has been continuously enhanced in recent years with features that greatly increase productivity. In particular, the C++-based SYCL standard provides a powerful programming model for heterogeneous systems that can target a wide range of devices, including multicore CPUs, GPUs, FPGAs, and other accelerators, while providing high-level abstractions. This presentation introduces our research efforts to design a SYCL-based high-level programming interface that provides advanced techniques such as task distribution and energy optimization. The key insight is that SYCL semantics can be extended to provide advanced features that integrate easily into existing SYCL programs. In particular, we will highlight two SYCL extensions: Celerity, which deals with workload distribution on accelerator clusters, and SYnergy, which deals with energy-efficient computing.
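To make the idea of workload distribution on a cluster more concrete, the following is a minimal sketch in plain C++ of splitting a one-dimensional index space evenly across nodes. It is not Celerity's API or its actual scheduling algorithm; the names `Chunk` and `split_range` are hypothetical and chosen only for illustration, and a real runtime would also have to track which data each chunk reads and writes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type: a contiguous slice of a 1D index space assigned to one node.
struct Chunk {
    std::size_t offset;  // first global index owned by the node
    std::size_t size;    // number of indices owned by the node
};

// Split `global_size` work items across `num_nodes` nodes as evenly as possible.
// The first (global_size % num_nodes) nodes receive one extra item.
// This static split is an assumption made for illustration only.
std::vector<Chunk> split_range(std::size_t global_size, std::size_t num_nodes) {
    assert(num_nodes > 0 && "at least one node is required");
    std::vector<Chunk> chunks;
    chunks.reserve(num_nodes);
    const std::size_t base = global_size / num_nodes;
    const std::size_t remainder = global_size % num_nodes;
    std::size_t offset = 0;
    for (std::size_t node = 0; node < num_nodes; ++node) {
        const std::size_t size = base + (node < remainder ? 1 : 0);
        chunks.push_back({offset, size});
        offset += size;  // the next chunk starts where this one ends
    }
    return chunks;
}
```

Each node would then run the same kernel over its own `[offset, offset + size)` slice. The chunks are contiguous and non-overlapping, so together they cover the whole index space exactly once.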

Problem

Research questions and friction points this paper is trying to address.

Efficient programming of GPUs and accelerators in HPC
Handling data movement between distributed computing nodes
Achieving energy-efficient computing in heterogeneous systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

SYCL-based high-level programming interface
Task distribution and energy optimization techniques
Extensions for workload and energy efficiency
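As a rough illustration of what an energy optimization decision can look like, the sketch below picks, from a set of measured runs of one kernel, the device frequency that minimizes a chosen metric. It is not SYnergy's API. The `Sample` type, the function names, and the choice of the energy-delay product as a metric are all assumptions made for this example; the source does not state which metrics or mechanisms SYnergy uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical measurement of one kernel run at a fixed device frequency.
struct Sample {
    double freq_mhz;     // device core frequency used for the run
    double avg_power_w;  // average power drawn during the run, in watts
    double runtime_s;    // kernel runtime, in seconds
};

// Energy in joules: average power multiplied by runtime.
double energy_joules(const Sample& s) { return s.avg_power_w * s.runtime_s; }

// Energy-delay product: penalizes configurations that save energy
// only by running much longer.
double energy_delay_product(const Sample& s) {
    return energy_joules(s) * s.runtime_s;
}

// Return the index of the sample with the lowest value of `metric`.
// Requires a non-empty sample set.
template <typename Metric>
std::size_t pick_best(const std::vector<Sample>& samples, Metric metric) {
    assert(!samples.empty() && "need at least one measurement");
    std::size_t best = 0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        if (metric(samples[i]) < metric(samples[best])) best = i;
    }
    return best;
}
```

Calling `pick_best(samples, energy_joules)` favors the lowest-energy configuration, while `pick_best(samples, energy_delay_product)` trades some energy for shorter runtime. The two metrics can select different frequencies for the same kernel.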
Biagio Cosenza
University of Salerno, Italy
high performance computing, software optimization, compilers, programming models, GPUs
Lorenzo Carpentieri
University of Salerno, Italy
Kaijie Fan
University of Salerno, Italy
Marco D'Antonio
Queen’s University Belfast, UK
Peter Thoman
University of Innsbruck, Austria
Philip Salzmann
University of Innsbruck, Austria