AI Summary
Although dynamic foveated rendering (DFR) systems do not directly expose eye-tracking data, gaze-dependent variations in GPU workload can leak gaze position through rendering performance metrics such as frame rate, creating a novel side-channel privacy threat. This work is the first to expose this vulnerability and introduces a high-precision attack that requires no access to eye-tracking APIs: the attacker sweeps imperceptible high-cost objects (HCOs) across the field of view and infers gaze coordinates from the performance fluctuations that occur when these objects overlap the foveal region. Experiments on the Meta Quest Pro, Varjo XR-4, and desktop platforms yield mean prediction errors of 1.1-4.4 degrees, approaching the accuracy of commercial eye trackers. The study also proposes supervised and unsupervised defense mechanisms that detect such attacks reliably within seconds, with an F1 score of 0.99.
Abstract
While eye tracking provides valuable capabilities for virtual reality, such as gaze interaction and dynamic foveated rendering (DFR), eye-tracking data can inadvertently reveal sensitive user information if not properly protected. Current protections, such as adding permission prompts or gatekeeping gaze data, are insufficient on DFR-enabled systems because gaze data is used internally to drive DFR. When DFR is implemented, objects in the fovea (i.e., immediate gaze area) incur a higher GPU workload than those in the periphery. This gaze-contingent workload creates a novel side channel, which can be leveraged to reconstruct gaze positions. Specifically, we design a novel attack that sweeps imperceptible high-cost objects (HCOs) across the user's field of view and logs rendering performance metrics (e.g., frame rate or frame time) commonly exposed through standard game engines. Then, we correlate variation in these metrics (caused by HCO-foveal overlap) with the known HCOs' positions to infer gaze coordinates directly without using eye-tracking APIs. Our experimental results show that mean gaze prediction errors (1.1-4.4 degrees) across the Meta Quest Pro, Varjo XR-4, and desktop platforms are comparable to typical eye-tracker accuracy. We demonstrate that the attack generalizes across various hardware platforms, standard game engines, and foveated rendering pipelines. Finally, we design defense mechanisms based on supervised and unsupervised detectors that can flag the attack reliably (F1 of 0.99) over short time windows.
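The correlation step described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: the frame times are synthetic, the Gaussian falloff standing in for the foveal workload increase is assumed, and the estimator (a weighted centroid over the slowest 10% of frames) is one simple choice among many. All function names and parameter values are hypothetical.

```python
import math
import random


def sweep_grid(extent_deg=30, step_deg=5):
    """HCO positions (degrees) visited during one sweep of the field of view."""
    axis = range(-extent_deg, extent_deg + 1, step_deg)
    return [(float(x), float(y)) for y in axis for x in axis]


def simulate_frame_times(gaze, positions, base_ms=11.1, extra_ms=3.0,
                         fovea_radius_deg=10.0, noise_ms=0.2, seed=0):
    """Synthetic frame times: rendering costs more when the HCO is near the gaze."""
    rng = random.Random(seed)
    times = []
    for x, y in positions:
        dist = math.hypot(x - gaze[0], y - gaze[1])
        # Assumed smooth falloff; a real DFR pipeline uses discrete shading-rate zones.
        overlap = math.exp(-(dist / fovea_radius_deg) ** 2)
        times.append(base_ms + extra_ms * overlap + rng.gauss(0.0, noise_ms))
    return times


def estimate_gaze(positions, frame_times, percentile=0.9):
    """Weighted centroid of HCO positions whose frame time exceeds a high percentile."""
    ordered = sorted(frame_times)
    threshold = ordered[int(percentile * (len(ordered) - 1))]
    weights = [max(t - threshold, 0.0) for t in frame_times]
    total = sum(weights)
    if total == 0.0:
        return None  # no frame stands out, so there is nothing to localize
    gx = sum(w * x for w, (x, _) in zip(weights, positions)) / total
    gy = sum(w * y for w, (_, y) in zip(weights, positions)) / total
    return gx, gy


if __name__ == "__main__":
    positions = sweep_grid()
    true_gaze = (8.0, -4.0)
    times = simulate_frame_times(true_gaze, positions)
    estimate = estimate_gaze(positions, times)
    error = math.hypot(estimate[0] - true_gaze[0], estimate[1] - true_gaze[1])
    print(f"estimated gaze: {estimate}, error: {error:.2f} deg")
```

The percentile threshold discards frames whose timing is dominated by noise, so only HCO positions that plausibly overlapped the fovea contribute to the estimate. The same signature, frame times that rise and fall in step with a sweeping object, is the kind of pattern a detector on the defense side could look for.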