🤖 AI Summary
Scarce real-world eye-tracking data and poor cross-device generalizability hinder deep learning–based gaze estimation. While synthetic data offer a potential remedy, conventional photorealistic rendering is computationally expensive and inefficient. To address this, we propose LEyes, a lightweight synthetic framework that introduces a dynamic synthesis paradigm based on a simplified analytical generator. LEyes requires no 3D modeling or manual annotation, enabling real-time generation of device-specific, simplified eye images optimized exclusively for pupil and corneal reflection (Pupil&CR) detection. Our method integrates a lightweight image synthesizer, an HRNet variant for keypoint detection, a domain-adaptive training strategy, and an end-to-end gaze estimation pipeline. Experiments demonstrate that models trained on LEyes-synthesized data achieve state-of-the-art Pupil&CR localization accuracy across multiple heterogeneous datasets. Moreover, when deployed on low-cost hardware, LEyes-based tracking outperforms commercial eye trackers in accuracy and robustness.
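The summary above describes generating simplified, non-photorealistic eye images labeled for Pupil&CR keypoint training. The paper's actual generator is not reproduced here, but a minimal sketch of the idea, a dark elliptical pupil plus bright Gaussian corneal-reflection blobs on a noisy background, with ground-truth keypoints returned alongside the image, might look like the following. All shapes, intensities, and parameter ranges are illustrative assumptions, not the distributions used by LEyes:

```python
import numpy as np

def synth_eye_image(size=128, rng=None):
    """Toy LEyes-style sample: a dark elliptical pupil and 1-2 bright
    Gaussian corneal-reflection (CR) blobs on a noisy grey background.
    Returns (image, labels) where labels hold the keypoint targets.
    Parameter ranges below are illustrative guesses only."""
    rng = np.random.default_rng() if rng is None else rng
    yy, xx = np.mgrid[0:size, 0:size].astype(np.float64)

    # Background: mid-grey with mild noise (stand-in for iris/skin texture).
    img = 0.6 + 0.05 * rng.standard_normal((size, size))

    # Pupil: dark ellipse with random centre, axes, and orientation.
    cx, cy = rng.uniform(0.3 * size, 0.7 * size, 2)
    ax_, ay_ = rng.uniform(0.08 * size, 0.2 * size, 2)
    theta = rng.uniform(0.0, np.pi)
    xr = (xx - cx) * np.cos(theta) + (yy - cy) * np.sin(theta)
    yr = -(xx - cx) * np.sin(theta) + (yy - cy) * np.cos(theta)
    img[(xr / ax_) ** 2 + (yr / ay_) ** 2 <= 1.0] = 0.1

    # Corneal reflections: small bright Gaussian blobs near the pupil.
    crs = []
    for _ in range(rng.integers(1, 3)):
        gx = cx + rng.uniform(-0.15, 0.15) * size
        gy = cy + rng.uniform(-0.15, 0.15) * size
        sigma = rng.uniform(1.0, 2.5)
        img += 0.8 * np.exp(-((xx - gx) ** 2 + (yy - gy) ** 2) / (2 * sigma ** 2))
        crs.append((gx, gy))

    return np.clip(img, 0.0, 1.0), {"pupil": (cx, cy), "crs": crs}
```

Because every sample is drawn analytically, images and exact keypoint labels can be produced on the fly during training, and the parameter ranges can be retuned per recording device, which is the efficiency argument the summary makes against full photorealistic rendering.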
📝 Abstract
Deep learning methods have significantly advanced the field of gaze estimation, yet the development of these algorithms is often hindered by a lack of appropriate, publicly accessible training datasets. Moreover, models trained on the few available datasets often fail to generalize to new datasets because of both hardware discrepancies and biological diversity among subjects. To mitigate these challenges, the research community has frequently turned to synthetic datasets, although this approach has its own drawbacks, such as the resource- and labor-intensive nature of creating photorealistic eye images for use as training data. In response, we introduce “Light Eyes” (LEyes), a novel framework that diverges from traditional photorealistic methods by using simple synthetic image generators to train neural networks to detect key image features such as pupils and corneal reflections. LEyes generates synthetic data on the fly, adapts to any recording device, and improves the efficiency of training neural networks for a wide range of gaze-estimation tasks. Our evaluations show that LEyes, in many cases, outperforms existing methods in accurately identifying and localizing pupils and corneal reflections across diverse datasets. Additionally, models trained on LEyes data outperform standard eye trackers while running on more cost-effective hardware, offering a promising avenue to overcome current limitations in gaze estimation technology.