🤖 AI Summary
Mobile gaze tracking loses significant accuracy as user postures and device orientations change, so conventional single-session calibration is insufficient. This paper proposes an IMU-based, motion-aware continual calibration framework. It is the first to integrate IMU-based activity recognition, clustering-driven triggering of recalibration, and replay-based continual learning, which lets it adapt to novel motion states while mitigating catastrophic forgetting of previously learned ones. The method relies solely on a smartphone's built-in IMU and a pre-trained visual gaze estimator, and requires no additional hardware. Evaluated on RGBDGaze and the authors' newly collected MotionGaze dataset, it reduces gaze estimation error by 19.9% and 31.7%, respectively. It substantially improves robustness and long-term stability across diverse postures (sitting, standing, lying, and walking) under real-world motion conditions.
📝 Abstract
Mobile gaze tracking faces a fundamental challenge: maintaining accuracy as users naturally change their postures and device orientations. Traditional approaches, such as one-off calibration, fail to adapt to these dynamic conditions, leading to degraded performance over time. We present MAC-Gaze, a Motion-Aware continual Calibration approach that leverages smartphone inertial measurement unit (IMU) sensors and continual learning techniques to automatically detect changes in user motion states and update the gaze tracking model accordingly. Our system integrates a pre-trained visual gaze estimator and an IMU-based activity recognition model with a clustering-based hybrid decision-making mechanism that triggers recalibration when motion patterns deviate significantly from previously encountered states. To learn new motion conditions cumulatively while mitigating catastrophic forgetting, we employ replay-based continual learning, which allows the model to maintain performance on previously encountered motion conditions. We evaluate our system through extensive experiments on the publicly available RGBDGaze dataset and our own 10-hour multimodal MotionGaze dataset (481K+ images, 800K+ IMU readings), which covers a wide range of postures under various motion conditions, including sitting, standing, lying, and walking. Results demonstrate that our method reduces gaze estimation error by 19.9% on RGBDGaze (from 1.73 cm to 1.41 cm) and by 31.7% on MotionGaze (from 2.81 cm to 1.92 cm) compared to traditional calibration approaches. Our framework provides a robust solution for maintaining gaze estimation accuracy in mobile scenarios.
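The abstract does not spell out the trigger or the replay mechanism, so the following is only a minimal sketch of the general idea, not the MAC-Gaze implementation. It assumes that each IMU window has already been reduced to a fixed-length feature vector, that known motion states are represented by centroids, and that recalibration is triggered when the nearest centroid is farther than a fixed distance threshold. The class names, the threshold, and the use of reservoir sampling for the replay buffer are all hypothetical choices made for illustration.

```python
import math
import random


class MotionStateTrigger:
    """Hypothetical trigger: flags recalibration when an IMU feature vector
    is far from every previously seen motion-state centroid."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed distance threshold, not from the paper
        self.centroids = []         # one centroid per known motion state

    def needs_recalibration(self, features):
        # With no known states yet, the first observation always triggers.
        if not self.centroids:
            return True
        nearest = min(math.dist(features, c) for c in self.centroids)
        return nearest > self.threshold

    def register_state(self, features):
        # Remember a newly calibrated motion state.
        self.centroids.append(list(features))


class ReplayBuffer:
    """Keeps a bounded sample of past calibration examples for replay."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        # Reservoir sampling: each sample seen so far is retained
        # with equal probability once the buffer is full.
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = sample

    def mix(self, new_samples, k):
        # Combine new-state data with up to k replayed old samples,
        # so an update also rehearses earlier motion conditions.
        replayed = self.rng.sample(self.samples, min(k, len(self.samples)))
        return list(new_samples) + replayed
```

In use, a caller would check `needs_recalibration` on each incoming feature vector, collect calibration samples when it returns true, fine-tune the gaze model on `mix(new_samples, k)`, and then call `register_state` and `add` so that the new state is recognized and rehearsed later.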