Cross-Dataset Gaze Estimation by Evidential Inter-intra Fusion

📅 2024-09-07
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Cross-dataset gaze estimation suffers from poor generalization and degraded accuracy because of distribution shifts between domains. To address this, we propose the Evidential Inter-intra Fusion (EIF) framework, which combines local subspace regression within each dataset with feature integration across datasets. EIF uses evidential regressors based on the Normal-Inverse-Gamma (NIG) distribution to predict gaze and quantify uncertainty at the same time. The architecture consists of independent single-dataset branches, each partitioning its data space into overlapping subspaces for local regression, plus a cross-dataset branch that integrates generalizable features from the single-dataset branches. A Mixture of NIG (MoNIG) mechanism fuses the local regressors within each dataset (intra-evidential fusion) and the branches with one another (inter-evidential fusion). Experiments show notable improvements on both the source domains and unseen target domains, so the gain in cross-domain generalization does not come at the cost of in-domain performance.
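As a rough illustration of how an NIG-based regressor can return a prediction and an uncertainty together, the sketch below reads off the standard moments of a Normal-Inverse-Gamma distribution with parameters (gamma, nu, alpha, beta). The class name `NIG`, the function `nig_moments`, and the treatment of a single gaze angle as a scalar are assumptions made for illustration; the summary does not describe the actual EIF network heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NIG:
    """Parameters of a Normal-Inverse-Gamma distribution (hypothetical container)."""
    gamma: float  # location: the predicted value, e.g. one gaze angle
    nu: float     # evidence for the mean, must be > 0
    alpha: float  # shape of the inverse-gamma part, must be > 1 for finite moments
    beta: float   # scale of the inverse-gamma part, must be > 0


def nig_moments(p: NIG) -> tuple[float, float, float]:
    """Return (prediction, aleatoric variance, epistemic variance).

    These are the standard NIG moments:
      prediction = E[mu]      = gamma
      aleatoric  = E[sigma^2] = beta / (alpha - 1)
      epistemic  = Var[mu]    = beta / (nu * (alpha - 1))
    """
    if p.nu <= 0 or p.alpha <= 1 or p.beta <= 0:
        raise ValueError("need nu > 0, alpha > 1, beta > 0")
    aleatoric = p.beta / (p.alpha - 1.0)
    epistemic = p.beta / (p.nu * (p.alpha - 1.0))
    return p.gamma, aleatoric, epistemic
```

Calling `nig_moments(NIG(gamma=0.1, nu=2.0, alpha=3.0, beta=4.0))` gives a prediction of 0.1, an aleatoric variance of 2.0, and an epistemic variance of 1.0. A larger `nu` (more evidence) shrinks only the epistemic term.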

📝 Abstract
Achieving accurate and reliable gaze predictions in complex and diverse environments remains challenging. Fortunately, it is straightforward to access diverse gaze datasets in real-world applications. We discover that training on these datasets jointly can significantly improve the generalization of gaze estimation, which is overlooked in previous works. However, due to the inherent distribution shift across different datasets, simply mixing multiple datasets decreases the performance in the original domain despite gaining better generalization abilities. To address the problem of "cross-dataset gaze estimation", we propose a novel Evidential Inter-intra Fusion (EIF) framework for training a cross-dataset model that performs well across all source and unseen domains. Specifically, we build an independent single-dataset branch for each dataset, where the data space is partitioned into overlapping subspaces within each dataset for local regression, and further create a cross-dataset branch to integrate the generalizable features from the single-dataset branches. Furthermore, evidential regressors based on the Normal Inverse-Gamma (NIG) distribution are designed to provide uncertainty estimation in addition to predicting gaze. Building upon this foundation, our proposed framework achieves both intra-evidential fusion among multiple local regressors within each dataset and inter-evidential fusion among multiple branches by the Mixture of Normal Inverse-Gamma (MoNIG) distribution. Experiments demonstrate that our method consistently achieves notable improvements in both source domains and unseen domains.
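The fusion step can be pictured with the NIG summation rule commonly used in the MoNIG literature, sketched below for scalar outputs. Whether EIF applies exactly this rule, and how it weights local regressors or branches, is not stated in the abstract, so the names `NIG`, `fuse_pair`, and `fuse` and the two-stage usage in the note are illustrative assumptions.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class NIG:
    """Normal-Inverse-Gamma parameters produced by one evidential regressor."""
    gamma: float  # predicted value (e.g. one gaze angle)
    nu: float     # evidence for the mean, > 0
    alpha: float  # inverse-gamma shape, > 1
    beta: float   # inverse-gamma scale, > 0


def fuse_pair(a: NIG, b: NIG) -> NIG:
    """Combine two NIG outputs with the MoNIG-style summation rule (assumed form)."""
    nu = a.nu + b.nu
    # Evidence-weighted mean: the more confident regressor dominates.
    gamma = (a.nu * a.gamma + b.nu * b.gamma) / nu
    alpha = a.alpha + b.alpha + 0.5
    # Disagreement between the inputs and the fused mean inflates beta,
    # which raises the fused uncertainty.
    beta = (a.beta + b.beta
            + 0.5 * a.nu * (a.gamma - gamma) ** 2
            + 0.5 * b.nu * (b.gamma - gamma) ** 2)
    return NIG(gamma, nu, alpha, beta)


def fuse(outputs: list[NIG]) -> NIG:
    """Fold the pairwise rule over any number of regressor outputs."""
    if not outputs:
        raise ValueError("need at least one NIG output")
    return reduce(fuse_pair, outputs)
```

Under these assumptions, intra-evidential fusion would call `fuse` on the local regressors of one dataset branch, and inter-evidential fusion would call it again on the per-branch results. For example, `fuse([NIG(0.0, 1.0, 2.0, 1.0), NIG(1.0, 3.0, 2.0, 1.0)])` yields `gamma = 0.75`, `nu = 4.0`, `alpha = 4.5`, and `beta = 2.375`.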
Problem

Research questions and friction points this paper is trying to address.

Gaze Prediction
Multi-dataset Generalization
Model Accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evidential Inter-intra Fusion (EIF)
Gaze Direction Prediction
Data Integration