Modeling Visual Memorability Assessment with Autoencoders Reveals Characteristics of Memorable Images

📅 2024-10-19
🏛️ arXiv.org
🤖 AI Summary
Problem: Image memorability, the probability that an image is remembered after a single exposure, lacks a well-established computational model, and its key visual determinants remain unclear. Method: We propose a VGG16-based autoencoder trained for a single epoch, mirroring human memory experiments that test recall after one exposure; for the first time, we directly model reconstruction error as a memorability predictor. We further apply Integrated Gradients for interpretability, identifying semantically salient regions and high-contrast textures as memorability cues. Contribution/Results: Inter-class discriminability of the latent representations correlates significantly and positively with memorability and is a stronger predictor than conventional features. Reconstruction error exhibits a significant negative correlation with human memorability scores (p < 0.001), and an MLP regressor trained on the latent representations achieves a Spearman correlation of 0.72. This work establishes a novel, interpretable computational paradigm for studying the neural and perceptual foundations of visual memory.

📝 Abstract
Image memorability refers to the phenomenon where certain images are more likely to be remembered than others. It is a quantifiable and intrinsic image attribute, defined as the likelihood of an image being remembered upon a single exposure. Despite advances in understanding human visual perception and memory, it is unclear what features contribute to an image's memorability. To address this question, we propose a deep learning-based computational modeling approach. We employ an autoencoder-based approach built on VGG16 convolutional neural networks (CNNs) to learn latent representations of images. The model is trained in a single-epoch setting, mirroring human memory experiments that assess recall after a single exposure. We examine the relationship between autoencoder reconstruction error and memorability, analyze the distinctiveness of latent space representations, and develop a multi-layer perceptron (MLP) model for memorability prediction. Additionally, we perform interpretability analysis using Integrated Gradients (IG) to identify the key visual characteristics that contribute to memorability. Our results demonstrate a significant correlation between the images' memorability score and the autoencoder's reconstruction error, as well as the robust predictive performance of its latent representations. Distinctiveness in these representations correlated significantly with memorability. Additionally, certain visual characteristics were identified as features contributing to image memorability in our model. These findings suggest that autoencoder-based representations capture fundamental aspects of image memorability, providing new insights into the computational modeling of human visual memory.
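The relationship between reconstruction error and memorability described above can be illustrated with a rank correlation. The following is a minimal sketch, not the authors' pipeline: it assumes per-image mean squared error as the reconstruction error and uses Spearman correlation as the statistic. The helper names (`mse`, `ranks`, `spearman`) and all pixel values and memorability scores are hypothetical and made up for illustration.

```python
from statistics import mean

def mse(original, reconstruction):
    """Mean squared error between two equally sized, flattened images."""
    return mean((o - r) ** 2 for o, r in zip(original, reconstruction))

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy data: three tiny "images", their reconstructions,
# and made-up memorability scores (not taken from the paper).
images = [[0.2, 0.8, 0.5], [0.9, 0.1, 0.4], [0.3, 0.3, 0.7]]
reconstructions = [[0.25, 0.7, 0.5], [0.5, 0.4, 0.4], [0.3, 0.4, 0.5]]
memorability = [0.81, 0.62, 0.74]

errors = [mse(img, rec) for img, rec in zip(images, reconstructions)]
rho = spearman(errors, memorability)
print(f"Spearman correlation (toy data): {rho:.3f}")
```

In a real analysis the errors would come from the trained autoencoder and the scores from a human memorability dataset; a library routine such as SciPy's `spearmanr` would also report a p-value.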
Problem

Research questions and friction points this paper is trying to address.

Identify visual features affecting image memorability
Develop autoencoder model for memorability prediction
Analyze latent representations for memorability characteristics
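One way to quantify how distinctive a latent representation is can be sketched as follows. This is an assumption for illustration only: the summary does not state the paper's exact distinctiveness measure, so the sketch uses the mean Euclidean distance from each latent vector to its `k` nearest neighbours. The function names and the toy latent vectors are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distinctiveness(latents, k=1):
    """Mean distance from each latent vector to its k nearest other vectors.

    Assumed definition for illustration; larger values mean the vector
    sits farther from its neighbours in latent space.
    """
    if k < 1 or k >= len(latents):
        raise ValueError("k must satisfy 1 <= k < number of latent vectors")
    scores = []
    for i, z in enumerate(latents):
        dists = sorted(
            euclidean(z, other) for j, other in enumerate(latents) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical 2-D latent codes: two close together, one far away.
toy_latents = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
toy_scores = distinctiveness(toy_latents, k=1)
print(toy_scores)
```

The resulting scores could then be correlated with memorability scores in the same way as reconstruction error.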
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autoencoder-based approach using VGG16 CNNs
Single-epoch training mimics human memory
Integrated Gradients for interpretability analysis
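Integrated Gradients attributes a model's output to its input features by averaging gradients along a straight path from a baseline input to the actual input and scaling by the input difference. The sketch below is a toy illustration under stated assumptions, not the paper's setup: `toy_model` is a hypothetical two-feature function standing in for a memorability predictor, gradients are estimated by central differences rather than backpropagation, and the path integral is approximated with the midpoint rule.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def integrated_gradients(f, x, baseline, steps=50):
    """Approximate Integrated Gradients attributions for input x.

    Averages gradients at `steps` midpoints on the straight line from
    `baseline` to `x`, then scales by (x - baseline) per feature.
    """
    total = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps  # midpoint of the s-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = numerical_gradient(f, point)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]

def toy_model(v):
    """Hypothetical stand-in for a predictor: quadratic in v[0], linear in v[1]."""
    return v[0] ** 2 + 3 * v[1]

x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(toy_model, x, baseline)
print(attributions)
```

A useful sanity check is the completeness property: the attributions should sum to approximately `toy_model(x) - toy_model(baseline)`. For an image model, the features would be pixels and the baseline is commonly an all-black image.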
Elham Bagheri
Vector Institute for Artificial Intelligence, Schwartz Reisman Innovation Campus, Toronto, ON, Canada; Department of Computer Science, Western University, London, ON, Canada
Yalda Mohsenzadeh
Western University & Vector Institute for Artificial Intelligence
Human-Machine Intelligence, Computational Neuroscience, Artificial Intelligence, Human Brain Imaging, Deep Learning