HadaSmileNet: Hadamard fusion of handcrafted and deep-learning features for enhancing facial emotion recognition of genuine smiles

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the fine-grained discrimination between genuine and posed smiles—a critical challenge in affective computing. We propose a physiology-inspired multimodal fusion framework that jointly leverages handcrafted D-Marker–based features and Transformer-derived deep representations. Crucially, we introduce a parameter-free Hadamard multiplicative fusion mechanism that enables efficient and interpretable interaction between these complementary feature modalities. This design avoids introducing additional learnable parameters, reducing model complexity by 26% while enhancing discriminative capacity. Evaluations on four benchmark datasets—UvA-NEMO, MMI, SPOS, and BBC—yield state-of-the-art accuracies of 88.7%, 99.7%, 98.5%, and 100%, respectively. These results surpass existing methods and empirically validate the efficacy of integrating physiological priors with deep learning for robust genuine-smile recognition.

📝 Abstract
The distinction between genuine and posed emotions represents a fundamental pattern recognition challenge with significant implications for data mining applications in social sciences, healthcare, and human-computer interaction. While recent multi-task learning frameworks have shown promise in combining deep learning architectures with handcrafted D-Marker features for smile facial emotion recognition, these approaches exhibit computational inefficiencies due to auxiliary task supervision and complex loss balancing requirements. This paper introduces HadaSmileNet, a novel feature fusion framework that directly integrates transformer-based representations with physiologically grounded D-Markers through parameter-free multiplicative interactions. Through systematic evaluation of 15 fusion strategies, we demonstrate that Hadamard multiplicative fusion achieves optimal performance by enabling direct feature interactions while maintaining computational efficiency. The proposed approach establishes new state-of-the-art results for deep learning methods across four benchmark datasets: UvA-NEMO (88.7%, +0.8), MMI (99.7%), SPOS (98.5%, +0.7), and BBC (100%, +5.0). Comprehensive computational analysis reveals a 26% parameter reduction and simplified training compared to multi-task alternatives, while feature visualization demonstrates enhanced discriminative power through direct domain knowledge integration. The framework's efficiency and effectiveness make it particularly suitable for practical deployment in multimedia data mining applications that require real-time affective computing capabilities.
Problem

Research questions and friction points this paper is trying to address.

Distinguishing genuine from posed smiles in facial emotion recognition
Addressing computational inefficiencies in multi-task learning frameworks
Integrating deep learning with handcrafted features for affective computing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hadamard multiplicative fusion of transformer and D-Marker features
Parameter-free feature integration for computational efficiency
Direct domain knowledge fusion without auxiliary task supervision
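The fusion mechanism named in the bullets above is, at its core, an elementwise (Hadamard) product of the two feature vectors, which adds no learnable parameters. A minimal sketch in plain Python, assuming both vectors have already been projected to a common dimensionality (the transformer backbone, D-Marker extraction, and classifier head from the paper are omitted, and the toy vectors are purely illustrative):

```python
def hadamard_fuse(deep_features, dmarker_features):
    """Parameter-free multiplicative fusion: elementwise product.

    Assumes both inputs share the same length; each handcrafted
    D-Marker value then gates the corresponding deep-feature
    channel without introducing any additional weights.
    """
    if len(deep_features) != len(dmarker_features):
        raise ValueError("feature vectors must share a dimension")
    return [d * m for d, m in zip(deep_features, dmarker_features)]


# Hypothetical toy vectors, not real model features.
fused = hadamard_fuse([0.5, -1.0, 2.0], [1.0, 0.2, 0.5])
```

In a full pipeline, `fused` would feed a classification head; because the product is parameter-free, the only trainable weights remain in the projections and classifier, which is consistent with the reduced model complexity the paper reports.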