🤖 AI Summary
In cloud-native environments, FPGA orchestration faces fundamental challenges—including the lack of hardware virtualization, weak resource isolation, and non-preemptible workloads—resulting in poor scalability and elasticity. To address these, we propose Funky, a full-stack FPGA-aware orchestration engine comprising: (1) a lightweight FPGA virtualization sandbox with strong, hypervisor-enforced isolation; (2) an FPGA state management mechanism supporting task preemption and checkpointing; and (3) an FPGA-aware container runtime and scheduler compliant with the CRI/OCI specifications. Funky preserves full OpenCL ecosystem compatibility: porting 23 existing applications requires modifying only 3.4% of their source code. OCI images shrink by 28.7×, runtime overhead stays at just 7.4%, and evaluation on a large-scale cluster demonstrates strong scalability and fault tolerance.
📝 Abstract
The adoption of FPGAs in cloud-native environments faces impediments due to FPGA hardware limitations and the CPU-oriented design of orchestrators, which lack virtualization, isolation, and preemption support for FPGAs. Consequently, cloud providers offer no orchestration services for FPGAs, resulting in low scalability, flexibility, and resiliency.
This paper presents Funky, a full-stack FPGA-aware orchestration engine for cloud-native applications. Funky offers primary orchestration services for FPGA workloads to achieve high performance, utilization, scalability, and fault tolerance, accomplished through three contributions: (1) FPGA virtualization for lightweight sandboxes, (2) FPGA state management enabling task preemption and checkpointing, and (3) FPGA-aware orchestration components following the industry-standard CRI/OCI specifications.
We implement and evaluate Funky using four x86 servers with Alveo U50 FPGA cards. Our evaluation highlights that Funky allows us to port 23 OpenCL applications from the Xilinx Vitis and Rosetta benchmark suites by modifying only 3.4% of their source code, while keeping OCI image sizes 28.7 times smaller than AMD's FPGA-accessible Docker containers. In addition, Funky incurs only 7.4% performance overhead compared to native execution, while providing virtualization with strong hypervisor-enforced isolation and cloud-native orchestration for a set of distributed FPGAs. Lastly, we evaluate Funky's orchestration services in a large-scale cluster using Google production traces, demonstrating its scalability, fault tolerance, and scheduling efficiency.