🤖 AI Summary
This paper addresses three key challenges in facial Action Unit (AU) detection: modeling subtle differences between AUs, severe class imbalance, and label noise. It proposes a contrastive learning framework that jointly leverages self-supervised and supervised signals. To mitigate label noise, it samples three distinct types of positive pairs, which injects self-supervised signals into the supervised objective. To alleviate class imbalance, it re-weights negative samples, adjusting the parameter-update step size for minority and majority class samples. Evaluated on five benchmarks (BP4D, DISFA, BP4D+, GFT, and Aff-Wild2), the method is reported to outperform existing state-of-the-art AU detection approaches. The results suggest improved fine-grained AU discrimination and robustness to noisy labels across these datasets.
📝 Abstract
In facial Action Unit (AU) detection, accurately capturing the subtle facial differences between distinct AUs is essential for reliable results. AU detection is further challenged by class imbalance and by noisy or false labels, both of which undermine detection accuracy. In this paper, we introduce a novel contrastive learning framework designed for AU detection that incorporates both self-supervised and supervised signals, thereby enhancing the learning of discriminative features. To tackle class imbalance, we employ a negative sample re-weighting strategy that adjusts the parameter-update step size for minority and majority class samples. Moreover, to address noisy and false AU labels, we employ a sampling technique that draws on three distinct types of positive sample pairs. This lets us inject self-supervised signals into the supervised signal, effectively mitigating the adverse effects of noisy labels. Our experiments on five widely used benchmark datasets (BP4D, DISFA, BP4D+, GFT, and Aff-Wild2) show that our approach outperforms state-of-the-art AU detection methods.
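To make the negative re-weighting idea concrete, the following is a minimal sketch of an InfoNCE-style contrastive loss in which each negative sample carries its own weight. It is an illustration under stated assumptions, not the paper's actual loss: the abstract does not give the formula, the weighting rule, or the three positive-pair types, so the function names (`cosine`, `weighted_info_nce`), the `neg_weights` parameter, and the temperature value are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weighted_info_nce(anchor, positive, negatives, neg_weights, temperature=0.1):
    """InfoNCE-style loss for one anchor with per-negative weights.

    Assumption (not from the paper): each negative's exponentiated
    similarity is scaled by a weight, so a larger weight makes that
    negative contribute more to the loss and to its gradient.
    """
    pos_term = math.exp(cosine(anchor, positive) / temperature)
    neg_term = sum(
        w * math.exp(cosine(anchor, n) / temperature)
        for n, w in zip(negatives, neg_weights)
    )
    return -math.log(pos_term / (pos_term + neg_term))


# Toy 2-D embeddings, purely illustrative.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

uniform = weighted_info_nce(anchor, positive, negatives, [1.0, 1.0])
upweighted = weighted_info_nce(anchor, positive, negatives, [2.0, 1.0])
```

With uniform weights this reduces to the standard single-positive InfoNCE loss. Raising a negative's weight increases the loss, which is one plausible way to change the effective update size for particular classes. How the weights would be derived from class frequencies is left open here because the abstract does not specify it.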