Raising the Ceiling: Better Empirical Fixation Densities for Saliency Benchmarking

📅 2026-05-05

🤖 AI Summary

Fixed-bandwidth isotropic Gaussian kernel density estimation (KDE), the standard way to estimate empirical fixation densities, has gone essentially unchanged for decades and is a poor fit for sample-level evaluation, which needs reliable per-image density estimates. This work proposes a mixture model that combines an adaptive-bandwidth KDE based on Abramson's method with center bias and uniform components and a state-of-the-art saliency model, with all parameters optimized per image via leave-one-subject-out cross-validation. Across multiple benchmarks, the resulting interobserver consistency estimates are substantially higher: median per-image gains of 5-15% in log-likelihood and up to 2 percentage points in AUC, with improvements exceeding 25% for the most affected images, which are the ones most relevant to failure case analysis.

📝 Abstract

Empirical fixation densities, spatial distributions estimated from human eye-tracking data, are foundational to saliency benchmarking. They directly shape benchmark conclusions, leaderboard rankings, failure case analyses, and scientific claims about human visual behavior. Yet the standard estimation method, fixed-bandwidth isotropic Gaussian KDE, has gone essentially unchanged for decades. This matters now more than ever: as the field shifts toward sample-level evaluation (failure case analysis, inverse benchmarking, per-image model comparison), reliable per-image density estimates become critical. We propose a principled mixture model that combines an adaptive-bandwidth KDE based on Abramson's method, center bias and uniform components, and a state-of-the-art saliency model, to capture different spatial and semantic types of interobserver consistency, and optimize all parameters per image via leave-one-subject-out cross-validation. Our method yields substantially higher interobserver consistency estimates across multiple benchmarks, with median per-image gains of 5-15% in log-likelihood and up to 2 percentage points in AUC. For the most affected images -- precisely those most relevant to failure case analysis -- improvements exceed 25%. We leverage these improved estimates to identify and analyze remaining failure cases of state-of-the-art saliency models, demonstrating that significant headroom for model improvement remains. More broadly, our findings highlight that empirical fixation densities should not be treated as fixed ground truths but as evolving estimates that improve with better methodology.
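To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes fixations are given as one `(n, 2)` array of `(x, y)` pixel coordinates per subject, uses Abramson's square-root rule with a fixed-bandwidth pilot estimate, and mixes the adaptive KDE with a Gaussian center bias and a uniform component. The saliency-model component is omitted, and all function names, parameters, and the unbounded (not image-renormalized) Gaussians are illustrative assumptions.

```python
import numpy as np

def gaussian_kde_2d(points, centers, bandwidths):
    """Isotropic Gaussian KDE evaluated at `points` (m, 2).
    `centers` is (n, 2); `bandwidths` is (n,), one std (pixels) per kernel."""
    diff = points[:, None, :] - centers[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)                 # (m, n)
    var = bandwidths[None, :] ** 2
    kernels = np.exp(-0.5 * sq_dist / var) / (2.0 * np.pi * var)
    return kernels.mean(axis=1)

def abramson_bandwidths(centers, h0):
    """Abramson's rule: h_i = h0 * sqrt(g / pilot(x_i)), where pilot is a
    fixed-bandwidth KDE and g is the geometric mean of the pilot values."""
    pilot = gaussian_kde_2d(centers, centers, np.full(len(centers), h0))
    g = np.exp(np.mean(np.log(pilot)))
    return h0 * np.sqrt(g / pilot)

def mixture_density(points, train_fix, h0, w_kde, w_center, w_uniform,
                    size, center_sigma):
    """Hypothetical mixture: adaptive KDE + center bias + uniform.
    `size` is (height, width); the weights are assumed to sum to 1."""
    height, width = size
    kde = gaussian_kde_2d(points, train_fix, abramson_bandwidths(train_fix, h0))
    center = np.array([[width / 2.0, height / 2.0]])
    center_bias = gaussian_kde_2d(points, center, np.array([center_sigma]))
    uniform = 1.0 / (width * height)
    return w_kde * kde + w_center * center_bias + w_uniform * uniform

def loso_log_likelihood(fix_by_subject, **params):
    """Leave-one-subject-out score: mean log density that a model fit on
    all other subjects assigns to the held-out subject's fixations."""
    scores = []
    for i, held_out in enumerate(fix_by_subject):
        train = np.concatenate(
            [f for j, f in enumerate(fix_by_subject) if j != i])
        scores.append(np.log(mixture_density(held_out, train, **params)).mean())
    return float(np.mean(scores))
```

Under these assumptions, per-image parameter selection would amount to evaluating `loso_log_likelihood` over candidate values of `h0` and the mixture weights for each image and keeping the best-scoring setting.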

Problem

Research questions and friction points this paper is trying to address.

fixation density

saliency benchmarking

eye-tracking

interobserver consistency

sample-level evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive-bandwidth KDE

mixture model

fixation density estimation

saliency benchmarking

leave-one-subject-out cross-validation
