🤖 AI Summary
This work proposes SPGen, a novel model for accurately predicting human scanpaths when viewing paintings. Built upon a fully convolutional network, SPGen integrates a differentiable fixation selection mechanism and a learnable Gaussian prior to capture natural viewing preferences. It is the first approach to jointly incorporate unsupervised domain adaptation and a stochastic noise sampler into scanpath generation for artworks, effectively bridging the domain gap between natural images and artistic paintings without requiring labeled data in the target domain. The method significantly enhances both the realism and diversity of generated scanpaths, outperforming existing approaches across multiple evaluation metrics. By doing so, SPGen provides an effective computational tool for advancing research in art perception and supporting digital heritage applications.
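The unsupervised domain adaptation credited above is typically realized with a gradient reversal layer (GRL): an identity map in the forward pass whose backward pass flips the sign of the gradient, so the shared feature extractor learns representations that confuse a domain classifier. The paper does not give implementation details here, so the following is a minimal, framework-free sketch of the mechanism with illustrative names and an assumed reversal strength `lam`:

```python
import numpy as np

class GradientReversal:
    """Sketch of a gradient reversal layer (Ganin-style domain adaptation).

    Forward pass: identity. Backward pass: incoming gradient scaled by
    -lam, which turns the domain classifier's minimization into
    adversarial maximization for the feature extractor. The value of
    lam is an assumption, not taken from SPGen.
    """

    def __init__(self, lam=0.5):
        self.lam = lam

    def forward(self, x):
        return x  # features pass through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output  # sign-flipped, scaled gradient


grl = GradientReversal(lam=0.5)
features = np.array([1.0, 2.0, 3.0])
out = grl.forward(features)           # identical to the input
grad = grl.backward(np.ones(3))       # reversed gradient: [-0.5, -0.5, -0.5]
```

In a full pipeline this layer would sit between the FCNN backbone and the domain classifier, while the scanpath head receives the unreversed features; only labeled natural-scene data drives the task loss, which is what makes the adaptation unsupervised on paintings.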
📝 Abstract
Understanding human visual attention is key to preserving cultural heritage. We introduce SPGen, a novel deep learning model that predicts scanpaths, the sequences of eye movements made when viewers observe paintings.
Our architecture uses a Fully Convolutional Neural Network (FCNN) with differentiable fixation selection and learnable Gaussian priors to simulate natural viewing biases. To address the domain gap between photographs and artworks, we employ unsupervised domain adaptation via a gradient reversal layer, allowing the model to transfer knowledge from natural scenes to paintings. Furthermore, a random noise sampler models the inherent stochasticity of eye-tracking data.
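Two of the components named above lend themselves to a compact illustration: a Gaussian center-bias prior multiplied into a saliency map, and a differentiable fixation selection implemented as a soft-argmax (a softmax-weighted expected coordinate instead of a hard argmax). This is a hypothetical numpy sketch; the function names, `sigma`, and `temperature` are illustrative assumptions, not SPGen's actual parameters:

```python
import numpy as np

def gaussian_prior(h, w, sigma=0.3):
    """Center-bias map; sigma is a fraction of image size (learnable in SPGen)."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    d2 = ((ys - cy) / (sigma * h)) ** 2 + ((xs - cx) / (sigma * w)) ** 2
    return np.exp(-0.5 * d2)  # peaks at 1.0 in the image center

def soft_fixation(saliency, temperature=0.1):
    """Differentiable fixation selection: softmax-weighted expected (y, x)."""
    h, w = saliency.shape
    logits = saliency / temperature
    p = np.exp(logits - logits.max())  # numerically stable softmax
    p /= p.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    return float((p * ys).sum()), float((p * xs).sum())

h, w = 32, 32
rng = np.random.default_rng(0)
sal = rng.random((h, w))
sal = sal * gaussian_prior(h, w)   # viewing-bias prior modulates saliency
sal = sal + 0.01 * rng.standard_normal((h, w))  # noise sampler: stochastic scanpaths
fy, fx = soft_fixation(sal)        # expected fixation, usable in a gradient chain
```

Because the expected coordinate is a smooth function of the map, gradients can flow from a scanpath loss back into the network, which is what "differentiable fixation selection" buys over a hard argmax; the small additive noise stands in for the stochastic sampler that yields a different plausible scanpath on each draw.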
Extensive testing shows that SPGen outperforms existing methods, offering a powerful tool for analyzing gaze behavior and advancing the preservation and appreciation of artistic treasures.