Interpretable Image Emotion Recognition: A Domain Adaptation Approach Using Facial Expressions

📅 2020-11-17
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the scarcity of well-annotated data and pre-trained models for image emotion recognition (IER), this paper proposes a cross-domain IER method for generic images, including non-facial and non-human content. It transfers knowledge from facial expression recognition (FER) models via feature-level domain adaptation, jointly optimizing a discrepancy loss for domain alignment. A novel interpretability method, Divide and Conquer based Shap (DnCShap), combines a divide-and-conquer strategy with SHAP to identify the visual features most relevant to each prediction, complemented by embedding-space projection plots and heatmaps of salient regions. Evaluated on four benchmark datasets (IAPSa, ArtPhoto, FI, and EMOTIC), the approach achieves accuracies of 61.86%, 62.47%, 70.78%, and 59.72%, respectively.
📝 Abstract
This paper proposes a feature-based domain adaptation technique for identifying emotions in generic images, encompassing both facial and non-facial objects, as well as non-human components. This approach addresses the challenge of the limited availability of pre-trained models and well-annotated datasets for Image Emotion Recognition (IER). Initially, a deep-learning-based Facial Expression Recognition (FER) system is developed, classifying facial images into discrete emotion classes. Maintaining the same network architecture, this FER system is then adapted to recognize emotions in generic images through the application of discrepancy loss, enabling the model to effectively learn IER features while classifying emotions into categories such as 'happy,' 'sad,' 'hate,' and 'anger.' Additionally, a novel interpretability method, Divide and Conquer based Shap (DnCShap), is introduced to elucidate the visual features most relevant for emotion recognition. The proposed IER system demonstrated emotion classification accuracies of 61.86% for the IAPSa dataset, 62.47% for the ArtPhoto dataset, 70.78% for the FI dataset, and 59.72% for the EMOTIC dataset. The system effectively identifies the important visual features that lead to specific emotion classifications and also provides detailed embedding plots explaining the predictions, enhancing the understanding and trust in AI-driven emotion recognition systems.
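The paper's exact DnCShap formulation is not reproduced here. As a rough illustration of the divide-and-conquer idea behind it, the sketch below recursively splits an image into quadrants and scores each region by the drop in the model's output when that region is occluded; the real method combines such a recursive partition with SHAP values, and `predict` stands in for any trained IER/FER classifier.

```python
import numpy as np

def dnc_attribution(image, predict, depth=3):
    """Divide-and-conquer occlusion attribution (hedged sketch, not the
    paper's DnCShap). Recursively splits the image into quadrants and
    credits each region with the drop in the model's score when that
    region is masked out."""
    h, w = image.shape[:2]
    heat = np.zeros((h, w))

    def recurse(y0, y1, x0, x1, level):
        if level == 0 or y1 - y0 < 2 or x1 - x0 < 2:
            return
        base = predict(image)
        my, mx = (y0 + y1) // 2, (x0 + x1) // 2
        for ya, yb, xa, xb in [(y0, my, x0, mx), (y0, my, mx, x1),
                               (my, y1, x0, mx), (my, y1, mx, x1)]:
            occluded = image.copy()
            occluded[ya:yb, xa:xb] = 0.0  # mask this quadrant
            heat[ya:yb, xa:xb] += base - predict(occluded)
            recurse(ya, yb, xa, xb, level - 1)

    recurse(0, h, 0, w, depth)
    return heat

# Toy usage: a dummy "model" that only looks at the top-left corner,
# so attribution should concentrate there.
image = np.ones((8, 8))
predict = lambda img: float(img[:4, :4].mean())
heat = dnc_attribution(image, predict)
assert heat[:4, :4].min() > 0          # attended region is credited
assert np.allclose(heat[4:, :], 0.0)   # ignored regions get no credit
```

The recursion refines the heatmap from coarse quadrants down to small patches, which is what makes the divide-and-conquer variant cheaper than exhaustively occluding every pixel.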
Problem

Research questions and friction points this paper is trying to address.

Domain adaptation for emotion recognition
Interpretability in image emotion classification
Limited annotated datasets for emotion analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feature-based domain adaptation technique
Discrepancy loss for emotion recognition
Divide and Conquer based Shap interpretability
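The abstract describes aligning FER and IER features with a discrepancy loss but does not give its exact form. A minimal sketch of one common choice, a squared distance between batch feature means (the paper's formulation may differ):

```python
import numpy as np

def feature_discrepancy(source_feats, target_feats):
    """Squared L2 distance between the mean feature vectors of a source
    (FER) batch and a target (IER) batch. Minimizing this during
    training pulls the two domains' feature distributions together;
    a hedged stand-in for the paper's discrepancy loss."""
    mu_s = source_feats.mean(axis=0)
    mu_t = target_feats.mean(axis=0)
    return float(np.sum((mu_s - mu_t) ** 2))

# Toy check: a domain-shifted batch yields a larger discrepancy
# than a batch drawn from the same distribution.
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(64, 128))   # source-domain features
tgt = rng.normal(0.0, 1.0, size=(64, 128))   # matched target features
shifted = tgt + 2.0                           # shifted target features
assert feature_discrepancy(src, tgt) < feature_discrepancy(src, shifted)
```

In a full pipeline this term would be added to the classification loss, so the shared backbone learns features that both classify emotions and transfer across domains.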
Puneet Kumar
Center for Machine Vision and Signal Analysis, University of Oulu, Finland
B. Raman
Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India