NVBleed: Covert and Side-Channel Attacks on NVIDIA Multi-GPU Interconnect

📅 2025-03-22
🤖 AI Summary
This work identifies, for the first time, that NVIDIA's NVLink interconnect in multi-GPU systems leaks information across GPUs and even across virtual machines (VMs), introducing a new side-channel and covert-channel attack surface in cloud environments. Method: Leveraging NVLink reverse engineering, timing-contention modeling, hardware performance counter monitoring (e.g., NVLink utilization), and machine learning-based classification (evaluated via F1 score), we construct a high-bandwidth contention covert channel (over 70 Kbps with a 4.78% bit error rate) and end-to-end side-channel attack prototypes. Contribution/Results: Our attacks fingerprint 18 HPC and deep learning applications with an F1 score of up to 97.78% and identify 3D characters rendered in Blender with an F1 score of up to 91.56%. The leakage also crosses VM boundaries on a real-world cloud platform, Google Cloud Platform (GCP), where the Blender side-channel attack achieves F1 scores above 88%. This is the first empirical validation of NVLink's security risks in multi-tenant cloud GPU deployments, and it provides insights for secure GPU interconnect design.
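The attack results above are reported as F1 scores. As a reminder of what that metric measures, here is a minimal sketch in plain Python. The labels in the usage note are made up for illustration, and the choice of macro averaging over classes is an assumption, since the summary does not say how per-class scores are combined.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f1_score(y_true, y_pred, c) for c in labels) / len(labels)
```

For example, `macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])` averages a per-class F1 of 2/3 for `"a"` and 0.8 for `"b"`.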

📝 Abstract
Multi-GPU systems are becoming increasingly important in high-performance computing (HPC) and cloud infrastructure, providing acceleration for data-intensive applications, including machine learning workloads. These systems consist of multiple GPUs interconnected through high-speed networking links such as NVIDIA's NVLink. In this work, we explore whether the interconnect on such systems can offer a novel source of leakage, enabling new forms of covert and side-channel attacks. Specifically, we reverse engineer the operations of NVLink and identify two primary sources of leakage: timing variations due to contention and accessible performance counters that disclose communication patterns. The leakage is visible remotely and even across VM instances in the cloud, enabling potentially dangerous attacks. Building on these observations, we develop two types of covert-channel attacks across two GPUs, achieving a bandwidth of over 70 Kbps with an error rate of 4.78% for the contention channel. We develop two end-to-end cross-GPU side-channel attacks: application fingerprinting (including 18 high-performance computing and deep learning applications) and 3D graphics character identification within Blender, a multi-GPU rendering application. These attacks are highly effective, achieving F1 scores of up to 97.78% and 91.56%, respectively. We also discover that leakage surprisingly occurs across virtual machines on the Google Cloud Platform (GCP) and demonstrate a side-channel attack on Blender, achieving F1 scores exceeding 88%. Finally, we explore potential defenses, such as managing access to counters and reducing the resolution of the clock, to mitigate the two sources of leakage.
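The idea behind a contention-based covert channel, and why a coarser clock can blunt it, can be illustrated with a small simulation. This is only a sketch under stated assumptions and not the paper's implementation: it does not touch NVLink or any GPU, and all latency values, the noise model, the threshold, and the function names are hypothetical. The sender is assumed to raise the receiver's observed transfer latency when it transmits a 1, the receiver decodes by thresholding, and the error rate is the fraction of mismatched bits. Quantizing each measured latency stands in, in simplified form, for reducing timer resolution.

```python
import random


def simulate_latency(bit, rng, base_ns=1000.0, contention_ns=400.0, noise_ns=100.0):
    """Hypothetical receiver latency: higher when the sender creates contention (bit 1)."""
    mean = base_ns + (contention_ns if bit else 0.0)
    return rng.gauss(mean, noise_ns)


def quantize(latency_ns, resolution_ns):
    """Round a measurement down to the timer resolution (simplified coarse clock)."""
    return (latency_ns // resolution_ns) * resolution_ns


def decode(latencies, threshold_ns):
    """Receiver: a latency above the threshold is read as bit 1."""
    return [1 if t > threshold_ns else 0 for t in latencies]


def bit_error_rate(sent, received):
    """Fraction of bits that were decoded incorrectly."""
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)


def run_channel(n_bits=2000, resolution_ns=1.0, threshold_ns=1200.0, seed=0):
    """Send random bits through the simulated channel and return the error rate."""
    rng = random.Random(seed)
    sent = [rng.randint(0, 1) for _ in range(n_bits)]
    latencies = [quantize(simulate_latency(b, rng), resolution_ns) for b in sent]
    return bit_error_rate(sent, decode(latencies, threshold_ns))
```

With these made-up parameters, `run_channel(resolution_ns=1.0)` yields a low error rate, while `run_channel(resolution_ns=10000.0)` collapses every measurement to the same value, so the receiver does no better than guessing a constant bit.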
Problem

Research questions and friction points this paper is trying to address.

Explores covert and side-channel attacks on NVIDIA's NVLink interconnect
Identifies timing and performance counter leaks across GPUs and VMs
Demonstrates effective attacks on HPC and deep learning applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reverse engineer NVLink for leakage sources
Develop covert-channel attacks across GPUs
Demonstrate cross-VM side-channel attacks