AI Summary
Cross-domain facial expression recognition (CD-FER) suffers from severe domain shift between training and deployment data. To address this, we propose a framework that integrates graph attention with adversarial domain alignment. Specifically, we design a batch-level sparse ring-structured graph to explicitly model inter-sample relationships across domains, and jointly employ gradient reversal layers (GRL), CORAL, and maximum mean discrepancy (MMD) for multi-level distribution alignment. Built upon ResNet-50, our method achieves an average cross-domain accuracy of 74.39% under unsupervised domain adaptation and reaches 98.0% on the RAF-DB → FER2013 transfer task, approximately 36 percentage points above the best re-implemented baseline. The core contribution lies in introducing structured graph modeling into CD-FER, which enhances both cross-domain feature discriminability and generalization capability.
Abstract
Cross-domain facial expression recognition (CD-FER) remains difficult due to severe domain shift between training and deployment data. We propose Graph-Attention Network with Adversarial Domain Alignment (GAT-ADA), a hybrid framework that couples a ResNet-50 backbone with a batch-level Graph Attention Network (GAT) to model inter-sample relations under shift. Each mini-batch is cast as a sparse ring graph so that attention aggregates cross-sample cues that are informative for adaptation. To align distributions, GAT-ADA combines adversarial learning via a Gradient Reversal Layer (GRL) with statistical alignment using CORAL and MMD. GAT-ADA is evaluated under a standard unsupervised domain adaptation protocol: training on one labeled source (RAF-DB) and adapting to multiple unlabeled targets (CK+, JAFFE, SFEW 2.0, FER2013, and ExpW). GAT-ADA attains 74.39% mean cross-domain accuracy. On RAF-DB to FER2013, it reaches 98.0% accuracy, an improvement of approximately 36 percentage points over the best baseline we re-implemented with the same backbone and preprocessing.
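To make the batch-level ring graph and the statistical alignment term more concrete, the following is a minimal sketch in plain Python and not the authors' implementation. The abstract does not specify how ring neighbors are chosen or which MMD kernel is used, so the function names `ring_edges` and `linear_mmd2`, the `hops` parameter, and the choice of a linear kernel are all assumptions made for illustration. With a linear kernel, the biased squared MMD reduces to the squared distance between the source and target feature means.

```python
def ring_edges(batch_size, hops=1):
    """Directed edges (src, dst) of a ring graph over one mini-batch.

    Hypothetical construction: node i is linked to its `hops` nearest
    neighbors on each side, with indices wrapping around the batch.
    """
    if batch_size < 2:
        return []
    edges = set()  # a set removes duplicates that arise in very small batches
    for i in range(batch_size):
        for h in range(1, hops + 1):
            for j in ((i + h) % batch_size, (i - h) % batch_size):
                if j != i:
                    edges.add((i, j))
    return sorted(edges)


def linear_mmd2(source, target):
    """Biased squared MMD with a linear kernel (assumed kernel choice).

    Equals ||mean(source) - mean(target)||^2 for lists of feature vectors.
    """
    dim = len(source[0])
    mean_s = [sum(row[d] for row in source) / len(source) for d in range(dim)]
    mean_t = [sum(row[d] for row in target) / len(target) for d in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))


if __name__ == "__main__":
    # A batch of 6 samples: each node attends to its two ring neighbors.
    print(ring_edges(6))
    # Toy 2-D features for two source and two target samples.
    print(linear_mmd2([[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [3.0, 3.0]]))
```

In this sketch the edge count grows linearly with batch size (two outgoing edges per node when `hops=1`), which is what keeps the graph sparse compared with a fully connected batch graph. An edge list of this form could then be passed to a graph attention layer, and the MMD value added to the training loss alongside the adversarial and CORAL terms, with weights that the abstract does not state.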