Learning from Heterogeneity: Generalizing Dynamic Facial Expression Recognition via Distributionally Robust Optimization

📅 2025-07-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address sample heterogeneity arising from multi-source data and inter-individual variability in dynamic facial expression recognition (DFER), this paper proposes the Heterogeneity-aware Distributional Framework (HDF) to enhance model generalization. HDF comprises two core components: (i) a Time-Frequency Distributional Attention Module (DAM) that jointly models temporal consistency and frequency-domain robustness through a dual-branch attention design; and (ii) a Distribution-aware Scaling Module (DSM) that integrates information-bottleneck principles with gradient sensitivity to dynamically balance the classification and contrastive losses, yielding stable and discriminative representation learning. Evaluated on the DFEW and FERV39k benchmarks, HDF achieves significant improvements in both weighted and unweighted average recall, demonstrating superior robustness and generalization under severe class imbalance.
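The loss-balancing idea in DSM, weighting objectives by their gradient sensitivity so that neither the classification nor the contrastive term dominates, can be sketched in a few lines. The function below is an illustrative simplification (inverse gradient-norm weighting), not the paper's actual DSM formulation; `balance_losses` and its arguments are hypothetical names.

```python
def balance_losses(grad_norm_cls: float, grad_norm_con: float, eps: float = 1e-8):
    """Weight each loss inversely to its gradient norm so neither dominates.

    Illustrative sketch only: the paper's DSM also incorporates
    information-bottleneck principles beyond this simple reweighting.
    """
    inv_cls = 1.0 / (grad_norm_cls + eps)
    inv_con = 1.0 / (grad_norm_con + eps)
    total = inv_cls + inv_con
    # Normalized weights sum to 1; the loss with the larger gradient
    # magnitude receives the smaller weight.
    return inv_cls / total, inv_con / total

# Example: a classification loss with a 4x larger gradient norm is down-weighted.
w_cls, w_con = balance_losses(grad_norm_cls=4.0, grad_norm_con=1.0)
# → approximately (0.2, 0.8)
```

In a training loop, the total objective would then be `w_cls * loss_cls + w_con * loss_con`, with the weights recomputed per step from the current gradient norms.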

📝 Abstract
Dynamic Facial Expression Recognition (DFER) plays a critical role in affective computing and human-computer interaction. Although existing methods achieve competitive performance, they inevitably suffer from performance degradation under sample heterogeneity caused by multi-source data and individual expression variability. To address these challenges, we propose a novel framework, called the Heterogeneity-aware Distributional Framework (HDF), and design two plug-and-play modules to enhance time-frequency modeling and mitigate the optimization imbalance caused by hard samples. Specifically, the Time-Frequency Distributional Attention Module (DAM) captures both temporal consistency and frequency robustness through a dual-branch attention design, improving tolerance to sequence inconsistency and visual style shifts. Then, based on gradient sensitivity and information bottleneck principles, an adaptive optimization module, the Distribution-aware Scaling Module (DSM), is introduced to dynamically balance classification and contrastive losses, enabling more stable and discriminative representation learning. Extensive experiments on two widely used datasets, DFEW and FERV39k, demonstrate that HDF significantly improves both recognition accuracy and robustness. Our method achieves superior weighted average recall (WAR) and unweighted average recall (UAR) while maintaining strong generalization across diverse and imbalanced scenarios. Code is released at https://github.com/QIcita/HDF_DFER.
Problem

Research questions and friction points this paper is trying to address.

Address performance degradation from sample heterogeneity in DFER
Enhance time-frequency modeling for robust expression recognition
Balance optimization losses to improve representation learning stability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-branch attention for time-frequency modeling
Dynamic loss balancing via gradient sensitivity
Plug-and-play modules for robust recognition
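As a rough intuition for the dual-branch time-frequency modeling listed above, one branch can summarize temporal consistency (frame-to-frame change) while the other summarizes the frequency content of each pixel's trajectory over time. The sketch below uses fixed NumPy statistics purely for illustration; the paper's DAM uses learned attention, and `time_frequency_features` is a hypothetical name.

```python
import numpy as np

def time_frequency_features(clip: np.ndarray):
    """Toy dual-branch descriptor for a video clip shaped (T, H, W).

    Temporal branch: mean absolute frame-to-frame difference (a consistency cue).
    Frequency branch: mean non-DC FFT magnitude of each pixel's temporal
    trajectory (a cue that is less sensitive to static visual style).
    Illustrative only; not the paper's DAM.
    """
    temporal = np.abs(np.diff(clip, axis=0)).mean()
    spectrum = np.abs(np.fft.rfft(clip, axis=0))  # per-pixel temporal spectrum
    frequency = spectrum[1:].mean()               # drop the DC bin
    return temporal, frequency

rng = np.random.default_rng(0)
clip = rng.random((16, 8, 8))  # 16 frames of an 8x8 toy "video"
t_feat, f_feat = time_frequency_features(clip)
```

A static clip scores near zero on both branches, while flicker or style shifts raise the frequency statistic, which is the kind of signal a learned frequency branch can exploit.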
Feng-Qi Cui
University of Science and Technology of China
Multimedia, Trustworthy AI, LLM, AI4S
Anyang Tong
Hefei University of Technology
Jinyang Huang
Hefei University of Technology, Hefei, China
Jie Zhang
IHPC and CFAR, Agency for Science, Technology and Research, Singapore, Singapore
Dan Guo
IEEE senior member, Professor, Hefei University of Technology
Multimedia Computing, Artificial Intelligence
Zhi Liu
The University of Electro-Communications, Tokyo, Japan
Meng Wang
Hefei University of Technology, Hefei, China