🤖 AI Summary
This study addresses the generalization of gender fairness in cross-corpus speech emotion recognition (SER), revealing significant performance disparities between male and female speakers under cross-dataset evaluation. To tackle this, we conduct one of the first systematic investigations of how gender fairness transfers between source and target domains, and propose a dual-path fairness adaptation framework that combines adversarial debiasing with gender-aware feature alignment within a transfer learning paradigm. Experiments across multiple cross-corpus SER benchmarks show that our method reduces the inter-gender equalized odds difference (ΔEO) to ≤0.03, a substantial fairness improvement, while preserving baseline emotion classification accuracy. Our core contribution is a principled modeling and adaptation paradigm for gender fairness generalization in cross-corpus SER.
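The summary reports fairness as an equalized odds difference (ΔEO) without defining it. The sketch below shows one common way such a gap could be computed for a multi-class emotion classifier. It is an illustration under stated assumptions, not the paper's evaluation code: it treats each emotion one-vs-rest, compares true-positive and false-positive rates between two gender groups, and takes the largest absolute gap. The function names and the group labels `"female"` and `"male"` are hypothetical.

```python
def group_rate(preds, labels, cls, positive):
    """One-vs-rest rate for class `cls`.

    positive=True  -> true-positive rate (samples whose true label is `cls`)
    positive=False -> false-positive rate (samples whose true label is not `cls`)
    Returns None when the group has no samples of the required kind.
    """
    selected = [p for p, y in zip(preds, labels) if (y == cls) == positive]
    if not selected:
        return None
    return sum(p == cls for p in selected) / len(selected)


def equalized_odds_difference(preds, labels, genders, groups=("female", "male")):
    """Largest TPR or FPR gap between the two groups over all classes.

    Assumption: max-over-classes, one-vs-rest aggregation. The paper may
    aggregate differently (for example, by averaging over classes).
    """
    gaps = []
    for cls in sorted(set(labels)):
        for positive in (True, False):
            rates = []
            for g in groups:
                idx = [i for i, s in enumerate(genders) if s == g]
                rates.append(
                    group_rate([preds[i] for i in idx],
                               [labels[i] for i in idx],
                               cls, positive)
                )
            # Skip comparisons where a group has no relevant samples.
            if None not in rates:
                gaps.append(abs(rates[0] - rates[1]))
    return max(gaps) if gaps else 0.0
```

Under this definition, a value of 0 means both groups have identical per-class true-positive and false-positive rates, so a target such as ΔEO ≤ 0.03 corresponds to at most a three-percentage-point gap in any of those rates.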
📝 Abstract
Speech emotion recognition (SER) is a vital component of many everyday applications. Cross-corpus SER models are increasingly valued for their ability to generalize across datasets. However, their fairness across demographic groups in diverse corpora remains a concern. Existing fairness research often focuses solely on fairness within a single corpus and neglects whether that fairness generalizes to cross-corpus scenarios. Our study addresses this underexplored area by examining how well gender fairness generalizes in cross-corpus SER. We emphasize that the performance of a cross-corpus SER model and its fairness are two distinct considerations. Moreover, we propose a combined fairness adaptation mechanism that enhances gender fairness in SER transfer learning tasks by addressing gender in both the source and target corpora. Our findings offer some of the first insights into the generalizability of gender fairness in cross-corpus SER systems.