🤖 AI Summary
Existing driver attention prediction methods suffer from narrow field-of-view (FoV) constraints and limited scene diversity, hindering holistic modeling of omnidirectional attention during complex driving maneuvers such as lane changes, turns, and interactions with pedestrians or vehicles. To address this, we introduce DriverGaze360, a large-scale 360° panoramic driver gaze dataset with roughly 1 million gaze-labeled frames, capturing the omnidirectional visual attention of 19 drivers across diverse traffic scenarios. We propose DriverGaze360-Net, a multi-task framework that jointly optimizes gaze heatmap prediction and semantic segmentation to enable object-level guided wide-FoV attention modeling. Leveraging panoramic image encoding, an auxiliary segmentation head, and a refined gaze annotation pipeline, our method significantly improves localization accuracy in peripheral regions and responsiveness to dynamic objects. It achieves state-of-the-art performance on multiple metrics for panoramic driver gaze prediction.
📝 Abstract
Predicting driver attention is a critical problem for developing explainable autonomous driving systems and understanding driver behavior in mixed human-autonomous vehicle traffic scenarios. Although significant progress has been made through large-scale driver attention datasets and deep learning architectures, existing works are constrained by a narrow frontal field of view and limited driving diversity. Consequently, they fail to capture the full spatial context of driving environments, especially during lane changes, turns, and interactions involving peripheral objects such as pedestrians or cyclists. In this paper, we introduce DriverGaze360, a large-scale 360$^\circ$ field-of-view driver attention dataset containing $\sim$1 million gaze-labeled frames collected from 19 human drivers, enabling comprehensive omnidirectional modeling of driver gaze behavior. Moreover, our panoramic attention prediction approach, DriverGaze360-Net, jointly learns attention maps and attended objects by employing an auxiliary semantic segmentation head. This improves spatial awareness and attention prediction across wide panoramic inputs. Extensive experiments demonstrate that DriverGaze360-Net achieves state-of-the-art attention prediction performance on multiple metrics on panoramic driving images. The dataset and method are available at https://av.dfki.de/drivergaze360.
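
As an illustration of what jointly learning attention maps and attended objects can look like in practice, the sketch below combines a gaze-heatmap loss with an auxiliary segmentation loss. It is not the authors' implementation: the choice of KL divergence for the heatmap term, pixel-wise cross-entropy for the segmentation term, the weight `seg_weight`, and all function names are assumptions made for this example.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gaze_kl_loss(pred_logits, target_map, eps=1e-8):
    """KL(target || prediction), treating each heatmap as a distribution over pixels.

    pred_logits: (H, W) raw scores from a hypothetical gaze head.
    target_map:  (H, W) non-negative ground-truth gaze heatmap.
    """
    p = softmax(pred_logits.reshape(-1).astype(np.float64), axis=0)
    q = target_map.reshape(-1).astype(np.float64)
    q = q / (q.sum() + eps)
    return float(np.sum(q * (np.log(q + eps) - np.log(p + eps))))

def segmentation_ce_loss(seg_logits, labels, eps=1e-8):
    """Mean pixel-wise cross-entropy for a hypothetical auxiliary segmentation head.

    seg_logits: (C, H, W) raw class scores.
    labels:     (H, W) integer class ids in [0, C).
    """
    probs = softmax(seg_logits.astype(np.float64), axis=0)
    h, w = labels.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    picked = probs[labels, rows, cols]  # probability assigned to the true class per pixel
    return float(-np.mean(np.log(picked + eps)))

def multitask_loss(pred_logits, target_map, seg_logits, labels, seg_weight=0.5):
    # Assumed weighting scheme: main gaze term plus a scaled auxiliary term.
    gaze = gaze_kl_loss(pred_logits, target_map)
    seg = segmentation_ce_loss(seg_logits, labels)
    return gaze + seg_weight * seg

# Toy example on a small 4x8 "panorama" with 3 semantic classes.
H, W, C = 4, 8, 3
total = multitask_loss(
    pred_logits=np.zeros((H, W)),
    target_map=np.ones((H, W)),
    seg_logits=np.zeros((C, H, W)),
    labels=np.zeros((H, W), dtype=int),
)
```

With uniform predictions and a uniform target, the gaze term is close to zero and the total reduces to `seg_weight` times the cross-entropy of a uniform guess over `C` classes. In a real training setup the two heads would share a panoramic encoder, so gradients from the segmentation term shape the features used for gaze prediction.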