🤖 AI Summary
Existing approaches to GPU vulnerability detection rely on translating CUDA programs for CPU execution, which fails to reflect the architectural characteristics of GPUs and consequently hinders the discovery of memory safety vulnerabilities. This work proposes the first GPU-native fuzzing framework that executes test cases directly on real GPU hardware, preserving the behavioral fidelity of the programs under test. By combining GPU-native fuzzing with in-depth analysis of CUDA programs, the approach systematically exposes fundamental limitations of current testing methodologies and establishes design principles and practical guidelines for high-fidelity, efficient memory safety testing in heterogeneous computing systems.
📝 Abstract
Modern computing is shifting from homogeneous CPU-centric systems to heterogeneous systems with closely integrated CPUs and GPUs. While the CPU software stack has benefited from decades of memory safety hardening, the GPU software stack remains dangerously immature. This discrepancy presents a critical security challenge: the world's most advanced AI and scientific workloads are increasingly deployed on systems whose GPU software stack remains vulnerable. In this paper, we study the key challenges of ensuring memory safety on heterogeneous systems. We show that, while the number of exploitable bugs in heterogeneous systems rises every year, current mitigation methods often rely on unfaithful translations, i.e., converting GPU programs to run on CPUs for testing, which fails to capture the architectural differences between CPUs and GPUs. We argue that faithfulness of program behavior is at the core of secure and reliable heterogeneous systems design. To ensure faithfulness, we discuss several design considerations for a GPU-native fuzzing pipeline for CUDA programs.
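To make the fidelity gap concrete, here is a minimal illustrative sketch (not taken from the paper) of the kind of bug at stake: an off-by-one bounds check in a CUDA kernel. The kernel name, sizes, and launch configuration are all hypothetical. On real GPU hardware, the stray global-memory write typically corrupts adjacent device memory silently, whereas a CPU translation of the same kernel running under a sanitizer would usually abort on the access, so the two executions can diverge in exactly the way the abstract describes.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: the bounds check uses <= instead of <, so the
// thread with i == n writes one element past the end of `out`.
__global__ void scale(const float *in, float *out, int n, float k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i <= n) {               // BUG: should be i < n
        out[i] = in[i] * k;     // out-of-bounds write when i == n
    }
}

int main() {
    const int n = 1024;
    float *in = nullptr, *out = nullptr;
    cudaMalloc(&in,  n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));

    // Launch enough threads that a thread with i == n exists.
    scale<<<(n + 256) / 256, 256>>>(in, out, n, 2.0f);
    cudaDeviceSynchronize();

    // On the GPU the stray write is usually silent corruption; a CPU
    // port of this kernel compiled with AddressSanitizer would instead
    // trap at out[n], so CPU-based testing observes different behavior
    // than the native GPU execution.
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Tools such as NVIDIA's `compute-sanitizer` can flag this access on-device, which is one reason native execution, rather than CPU translation, is the faithful place to observe such bugs.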