RAFI -- A Ray/Work Forwarding Infrastructure for Data Parallel Multi-Node/Multi-GPU Computing

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of efficient support for dynamically migrating work items, such as rays, across GPUs in multi-node, multi-GPU data-parallel computing. The authors propose RaFI, a software framework built on CUDA and MPI that is presented as the first to offer a unified interface through which GPU kernels can forward work items to other GPUs, while the framework automatically manages the underlying communication and data transfers. By abstracting away the complexities of coordinated CUDA-MPI programming, RaFI simplifies the development of multi-GPU collaborative applications. Evaluation across several use cases is reported to show that the framework eases programming while maintaining high performance and strong scalability.

📝 Abstract

We present RaFI, a CUDA- and MPI-based software framework that simplifies the task of building GPU-enabled data-parallel software in which rays or similar work items need to migrate between different GPUs. RaFI provides a simple interface for CUDA kernels to forward such work items to other GPUs, while under the hood managing all the CUDA- and MPI-related work required to make this happen. We describe RaFI's motivation and implementation, and show its potential in several example applications.
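
The forwarding pattern the abstract describes can be illustrated with a small host-only C++ sketch. It is not RaFI's API: `Ray`, `RankQueues`, `forward`, `exchange`, and `runUntilDone` are hypothetical names chosen for illustration, and the "ranks" are simulated inside one process rather than being real GPUs or MPI processes. The sketch only assumes the general idea of bucketing work items by destination and delivering them in rounds until no work remains in flight.

```cpp
// Hypothetical sketch, NOT RaFI's actual interface: a single-process
// simulation of forwarding rays between "ranks" (stand-ins for GPUs).
#include <cassert>
#include <cstddef>
#include <vector>

struct Ray {      // illustrative work item
    int   pixel;  // pixel this ray contributes to
    float t;      // distance travelled so far
};

// Per-rank queues: rays to process this round, plus one bucket per destination.
struct RankQueues {
    std::vector<Ray> incoming;               // rays delivered by the last exchange
    std::vector<std::vector<Ray>> outgoing;  // outgoing[d] = rays bound for rank d

    explicit RankQueues(int numRanks) : outgoing(numRanks) {}

    // What a kernel would call: queue a ray for delivery to destRank.
    void forward(const Ray& ray, int destRank) {
        assert(destRank >= 0 && destRank < static_cast<int>(outgoing.size()));
        outgoing[destRank].push_back(ray);
    }
};

// Simulated exchange: move every outgoing bucket into its destination's
// incoming queue. In a real multi-node setting this is the step where counts
// and payloads would be communicated (for example with MPI_Alltoall followed
// by MPI_Alltoallv). Returns the number of rays delivered; 0 means no work
// is in flight anymore.
std::size_t exchange(std::vector<RankQueues>& ranks) {
    for (auto& r : ranks) r.incoming.clear();
    std::size_t delivered = 0;
    for (auto& src : ranks) {
        assert(src.outgoing.size() == ranks.size());
        for (std::size_t d = 0; d < src.outgoing.size(); ++d) {
            auto& bucket = src.outgoing[d];
            auto& dst = ranks[d].incoming;
            dst.insert(dst.end(), bucket.begin(), bucket.end());
            delivered += bucket.size();
            bucket.clear();
        }
    }
    return delivered;
}

// Toy "kernel" loop: each ray advances by 1.0 per round and is forwarded to
// the next rank until it has travelled maxT. Returns the number of exchange
// rounds that actually delivered rays.
int runUntilDone(std::vector<RankQueues>& ranks, float maxT) {
    const int n = static_cast<int>(ranks.size());
    int rounds = 0;
    while (true) {
        for (int r = 0; r < n; ++r) {
            for (Ray ray : ranks[r].incoming) {  // copy: the ray is modified locally
                ray.t += 1.0f;
                if (ray.t < maxT) ranks[r].forward(ray, (r + 1) % n);
            }
        }
        if (exchange(ranks) == 0) break;  // global termination: nothing left to forward
        ++rounds;
    }
    return rounds;
}
```

In this sketch the per-rank code only ever calls `forward`, and all movement of data happens inside `exchange`. That mirrors the division of labor the abstract describes, with the kernel naming a destination and the framework handling the transfer, although the real system would perform that step with CUDA and MPI rather than in-process copies.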

Problem

Research questions and friction points this paper is trying to address.

data parallel

multi-GPU

ray forwarding

work migration

distributed computing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Ray forwarding

Multi-GPU computing

Data parallelism