AI Summary
This work gives an overview of the NTIRE 2026 Challenge on Video Saliency Prediction. For the challenge, the organizers prepared a new, openly licensed dataset of 2,000 diverse videos, with fixations and saliency maps derived from crowdsourced mouse tracking of over 5,000 assessors. Submissions were evaluated on a subset of 800 test videos using generally accepted saliency metrics, and the final phase included a code review. More than 20 teams made submissions, and 7 of them passed the final phase. All data used in the challenge has been made publicly available, providing a benchmark for further research in video saliency prediction.
Abstract
This paper presents an overview of the NTIRE 2026 Challenge on Video Saliency Prediction. The goal for challenge participants was to develop methods that automatically predict saliency maps for the provided video sequences. A novel, openly licensed dataset of 2,000 diverse videos was prepared for this challenge. The fixations and corresponding saliency maps were collected using crowdsourced mouse tracking and include viewing data from over 5,000 assessors. Evaluation was performed on a subset of 800 test videos using generally accepted quality metrics. The challenge attracted over 20 teams making submissions, and 7 teams passed the final phase with code review. All data used in this challenge is publicly available at https://github.com/msu-video-group/NTIRE26_Saliency_Prediction.
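The abstract does not name the "generally accepted quality metrics" it refers to. As an illustration only, the sketch below implements three metrics that are commonly used in the saliency literature: linear correlation coefficient (CC), similarity (SIM), and normalized scanpath saliency (NSS). Choosing these three is an assumption, not a statement of the challenge protocol, and the function names and toy inputs are hypothetical. The code uses only the Python standard library and treats one frame as a 2D list of non-negative values.

```python
from math import sqrt


def _flatten(frame):
    """Turn a 2D list (rows of pixel values) into a flat list of floats."""
    return [float(v) for row in frame for v in row]


def _zscore(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("constant map has no defined z-score")
    return [(v - mean) / std for v in values]


def cc(pred, gt):
    """Pearson correlation between predicted and ground-truth saliency maps."""
    p, g = _zscore(_flatten(pred)), _zscore(_flatten(gt))
    return sum(a * b for a, b in zip(p, g)) / len(p)


def sim(pred, gt):
    """Histogram intersection of the two maps, each normalized to sum to 1.

    Assumes both maps are non-negative and have a positive sum.
    """
    p, g = _flatten(pred), _flatten(gt)
    sum_p, sum_g = sum(p), sum(g)
    return sum(min(a / sum_p, b / sum_g) for a, b in zip(p, g))


def nss(pred, fixations):
    """Mean z-scored prediction at fixated pixels (fixations: binary mask).

    Assumes the mask marks at least one fixated pixel.
    """
    z = _zscore(_flatten(pred))
    hits = [zv for zv, f in zip(z, _flatten(fixations)) if f > 0]
    return sum(hits) / len(hits)


# Toy 2x2 frame; the values are made up for illustration.
pred = [[0.1, 0.2], [0.3, 0.9]]
gt = [[0.0, 0.1], [0.2, 1.0]]
fix = [[0, 0], [0, 1]]
print(cc(pred, gt), sim(pred, gt), nss(pred, fix))
```

For a video, per-frame scores like these are typically averaged over frames and then over videos. Whether the challenge aggregates this way is not stated in the abstract.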