🤖 AI Summary
This study addresses how the subjective perception of safety in urban street environments shapes residents’ willingness to cycle, a factor that existing computational models capture poorly because they do not explicitly model human visual attention. To bridge this gap, the authors propose an Eye-Tracking-Guided Perceived Cycling Safety framework (EG-PCS), which uses eye-tracking data as a supervisory signal so that a vision transformer aligns its attention with human gaze behavior during pairwise comparison learning. The method achieves safety-perception ranking performance comparable to state-of-the-art approaches while producing attention maps that align more closely with human visual attention, yielding more interpretable and cognitively plausible predictions.
📝 Abstract
Cycling delivers significant public-health and environmental benefits, yet its uptake in cities is often limited by perceived safety. When street environments appear unsafe, individuals are less likely to cycle, making perception a key barrier to adoption. Recent work has shown that pairwise comparisons of street-view images provide a scalable way to learn subjective safety judgments. However, existing approaches do not explicitly model human visual attention, which plays a central role in how humans perceive safety. We propose an Eye-Tracking-Guided Perceived Cycling Safety framework (EG-PCS) that integrates gaze data into a pairwise learning pipeline based on vision transformers. By supervising the model's attention mechanism with eye-tracking signals, we encourage alignment between learned attention maps and human fixation patterns. Experiments show that gaze-guided models achieve ranking performance comparable to state-of-the-art approaches while producing attention maps that more accurately reflect human visual attention. Our results demonstrate that incorporating eye-tracking information enhances both predictive accuracy and interpretability in perception-based urban analytics.
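The abstract does not give the training objective, so the following is only a minimal sketch of how a gaze-guided pairwise objective of this kind could be written. It assumes a logistic (Bradley-Terry style) ranking loss on two scalar safety scores and a KL-divergence term between a human gaze heatmap and the model's attention map over image patches. The function names, the choice of KL divergence, and the weight `lam` are all illustrative assumptions, not the authors' formulation.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_ranking_loss(score_a, score_b, label):
    """Logistic loss for one image pair (assumed form, not from the paper).

    label is 1.0 if image A was judged safer, 0.0 if image B was.
    """
    diff = score_a - score_b
    return label * softplus(-diff) + (1.0 - label) * softplus(diff)


def normalize(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("map must contain positive mass")
    return [v / total for v in values]


def attention_alignment_loss(model_attention, gaze_heatmap, eps=1e-12):
    """KL(gaze || attention) over flattened patch grids (assumed choice)."""
    if len(model_attention) != len(gaze_heatmap):
        raise ValueError("maps must cover the same patches")
    p = normalize(gaze_heatmap)
    q = normalize(model_attention)
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def total_loss(score_a, score_b, label, attn_a, gaze_a, attn_b, gaze_b, lam=0.5):
    """Ranking loss plus a weighted gaze-alignment term averaged over both images.

    lam is a hypothetical trade-off weight.
    """
    rank = pairwise_ranking_loss(score_a, score_b, label)
    align = 0.5 * (attention_alignment_loss(attn_a, gaze_a)
                   + attention_alignment_loss(attn_b, gaze_b))
    return rank + lam * align
```

In this sketch, setting `lam` to zero recovers a plain pairwise ranking model, while larger values push the attention maps toward the recorded fixation patterns, possibly at some cost in ranking performance.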