🤖 AI Summary
Personalized saliency map (PSM) prediction under few-shot settings (≤5 images per user) is challenging due to severe scarcity of individual eye-tracking data and substantial inter-user variability in visual attention patterns.
Method: We propose an image-diversity-driven eye-tracking data selection strategy coupled with a tensor regression modeling framework. Our approach explicitly captures the multi-dimensional structure of eye-movement responses—across users, images, and spatiotemporal locations—enabling structured knowledge transfer across users in few-shot scenarios. Diversity-aware support-set sampling enhances representativeness, while tensor regression jointly preserves both user-specific preferences and shared attentional priors.
Contribution/Results: To our knowledge, this is the first method enabling effective cross-user structural knowledge transfer for PSM prediction under few-shot constraints. It consistently outperforms state-of-the-art methods on multiple benchmarks, empirically validating the efficacy and generalizability of the synergistic “structural modeling + intelligent sampling” paradigm for few-shot PSM prediction.
📝 Abstract
This paper presents a few-shot personalized saliency prediction method based on inter-personnel gaze patterns. In contrast to general saliency maps, personalized saliency maps (PSMs) have great potential because they indicate person-specific visual attention, which is useful for obtaining individual visual preferences. PSM prediction is needed to obtain PSMs for unseen images, but it remains a challenging task due to the complexity of individual gaze patterns. Moreover, eye-tracking data from each person are necessary to construct and predict PSMs, but acquiring large amounts of such data is difficult. One solution for realizing PSM prediction from a limited amount of data is to make effective use of eye-tracking data obtained from other persons. To use the PSMs of other persons efficiently, this paper focuses on two points: the selection of the images used to acquire eye-tracking data, and the preservation of the structural information of other persons' PSMs. In the proposed method, the images are selected so that they elicit more diverse gaze patterns across persons, and the structural information is preserved by adopting a tensor-based regression method. Experimental results demonstrate that these two points are beneficial for few-shot PSM prediction.
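The image-selection idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that "diverse gaze patterns" can be approximated by the per-pixel variance of other persons' PSMs on each image, and it simply keeps the `k` images with the highest average variance. The function name `select_diverse_images`, the array layout `(n_images, n_persons, H, W)`, and the variance-based score are all hypothetical choices made for illustration.

```python
import numpy as np


def select_diverse_images(psms: np.ndarray, k: int) -> np.ndarray:
    """Pick k images on which other persons' gaze patterns differ the most.

    psms: array of shape (n_images, n_persons, H, W) holding the PSMs of
          already-measured persons (assumed layout, not from the paper).
    Returns the indices of the k selected images, highest score first.
    """
    if psms.ndim != 4:
        raise ValueError("psms must have shape (n_images, n_persons, H, W)")
    n_images = psms.shape[0]
    if not 1 <= k <= n_images:
        raise ValueError("k must be between 1 and the number of images")

    # Assumed diversity score: variance across persons at each pixel,
    # averaged over all pixels of the image.
    scores = psms.var(axis=1).mean(axis=(1, 2))

    # Stable sort on the negated scores so ties keep the lower image index.
    order = np.argsort(-scores, kind="stable")
    return order[:k]
```

A new person would then be asked to view only the selected images, and the resulting few-shot eye-tracking data would be passed to the regression stage. Other diversity criteria (for example, greedy selection that also penalizes redundancy between chosen images) would fit the same interface.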