🤖 AI Summary
This work addresses the significant performance disparities of appearance-based gaze estimation models across demographic groups defined by ethnicity and gender, an issue that has so far lacked systematic fairness evaluation. We present the first fairness benchmark for gaze estimation, analyzing state-of-the-art models with established fairness metrics and evaluating existing debiasing methods. Our study reveals substantial inter-group performance gaps in current models and highlights the limitations of prevailing debiasing strategies. To foster future research in fairness-aware gaze estimation, we publicly release our annotated dataset, implementation code, and pretrained models.
📝 Abstract
While appearance-based gaze estimation has achieved significant improvements in accuracy and domain adaptation, the fairness of these systems across different demographic groups remains largely unexplored. To date, there is no comprehensive benchmark quantifying algorithmic bias in gaze estimation. This paper presents the first extensive evaluation of fairness in appearance-based gaze estimation, focusing on ethnicity and gender attributes. We establish a fairness baseline by analyzing state-of-the-art models using standard fairness metrics, revealing significant performance disparities. Furthermore, we evaluate existing bias mitigation strategies when applied to the gaze domain and show that they yield only limited fairness gains. We summarize key insights and open issues. Overall, our work calls for research into developing robust, equitable gaze estimators. To support future research and reproducibility, we publicly release our annotations, code, and trained models at: github.com/akgulburak/gaze-estimation-fairness
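The abstract does not name the fairness metrics used, so the following is only an illustrative sketch of one common way to quantify an inter-group performance disparity: the mean angular error per demographic group and the max-min gap between groups. The function names, the gap definition, and the toy data are assumptions made for illustration, not the paper's protocol or results.

```python
import numpy as np

def angular_error_deg(pred, target):
    """Angle in degrees between corresponding 3D gaze vectors (one per row)."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    # Clip guards against values slightly outside [-1, 1] from rounding.
    cos = np.clip(np.sum(pred * target, axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def group_gap(errors, groups):
    """Mean error per group and the max-min gap across groups (assumed metric)."""
    per_group = {str(g): float(errors[groups == g].mean()) for g in np.unique(groups)}
    gap = max(per_group.values()) - min(per_group.values())
    return per_group, gap

# Made-up toy predictions and labels; "A" and "B" are hypothetical group names.
pred = np.array([[0.0, 0.0, -1.0], [0.0, 1.0, -1.0],
                 [1.0, 0.0, 0.0], [0.0, 0.0, -1.0]])
target = np.array([[0.0, 0.0, -1.0]] * 4)
groups = np.array(["A", "A", "B", "B"])

errors = angular_error_deg(pred, target)
per_group, gap = group_gap(errors, groups)
print(per_group, gap)
```

A gap of zero would mean equal average error across groups under this particular definition; other fairness metrics may rank models differently.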