Navigating Label Ambiguity for Facial Expression Recognition in the Wild

📅 2025-02-14
🤖 AI Summary
This work addresses two problems that real-world facial expression recognition (FER) faces together: label ambiguity and class imbalance. The authors propose Navigating Label Ambiguity (NLA), a framework that estimates and emphasizes ambiguous samples at every training iteration and that, to the best of their knowledge, is the first to handle both problems simultaneously. NLA has two components. Noise-aware Adaptive Weighting (NAW) assigns higher importance to ambiguous samples and lower importance to noisy ones, based on the correlation between the intermediate prediction scores for the ground-truth class and the nearest negative class. A consistency regularization term keeps the latent distributions consistent. As training progresses, the model focuses increasingly on hard ambiguous samples, which mostly belong to minority classes. Experiments on in-the-wild benchmarks, including RAF-DB and AffectNet, show that NLA outperforms existing methods in both overall accuracy and mean accuracy, indicating robustness to both noise and class imbalance.

📝 Abstract
Facial expression recognition (FER) remains a challenging task due to label ambiguity caused by the subjective nature of facial expressions and noisy samples. Additionally, class imbalance, which is common in real-world datasets, further complicates FER. Although many studies have shown impressive improvements, they typically address only one of these issues, leading to suboptimal results. To tackle both challenges simultaneously, we propose a novel framework called Navigating Label Ambiguity (NLA), which is robust under real-world conditions. The motivation behind NLA is that dynamically estimating and emphasizing ambiguous samples at each iteration helps mitigate noise and class imbalance by reducing the model's bias toward majority classes. To achieve this, NLA consists of two main components: Noise-aware Adaptive Weighting (NAW) and consistency regularization. Specifically, NAW adaptively assigns higher importance to ambiguous samples and lower importance to noisy ones, based on the correlation between the intermediate prediction scores for the ground truth and the nearest negative. Moreover, we incorporate a regularization term to ensure consistent latent distributions. Consequently, NLA enables the model to progressively focus on more challenging ambiguous samples, which primarily belong to the minority class, in the later stages of training. Extensive experiments demonstrate that NLA outperforms existing methods in both overall and mean accuracy, confirming its robustness against noise and class imbalance. To the best of our knowledge, this is the first framework to address both problems simultaneously.
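The abstract reports both overall and mean accuracy. As a minimal sketch of why the two differ under class imbalance, the snippet below takes mean accuracy to be the average of per-class accuracies (the usual reading in FER work, but an assumption here, since the page does not define it). The function name and the toy labels are hypothetical.

```python
from collections import defaultdict


def overall_and_mean_accuracy(y_true, y_pred):
    """Return (overall accuracy, mean per-class accuracy).

    Assumption: "mean accuracy" is the unweighted average of the
    accuracy computed separately for each ground-truth class.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    correct = defaultdict(int)  # correct predictions per ground-truth class
    total = defaultdict(int)    # number of samples per ground-truth class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)

    overall = sum(correct.values()) / len(y_true)
    mean = sum(correct[c] / total[c] for c in total) / len(total)
    return overall, mean


# Toy imbalanced set: 8 "happy" samples, 2 "fear" samples.
# A classifier biased toward the majority class predicts "happy" everywhere.
y_true = ["happy"] * 8 + ["fear"] * 2
y_pred = ["happy"] * 10
overall, mean = overall_and_mean_accuracy(y_true, y_pred)
print(overall, mean)  # overall = 0.8, mean = 0.5
```

Overall accuracy rewards the majority-class bias, while mean accuracy exposes it, which is why a method aimed at class imbalance is evaluated on both.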
Problem

Research questions and friction points this paper is trying to address.

Addresses label ambiguity in facial expression recognition
Mitigates class imbalance in real-world datasets
Proposes a novel framework to handle both challenges
Innovation

Methods, ideas, or system contributions that make the work stand out.

Noise-aware Adaptive Weighting
Consistency regularization
Dynamic estimation of ambiguous samples
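The items above can be illustrated with a toy sketch. The abstract only states that NAW weights samples using the prediction scores for the ground truth and the nearest negative; it does not give the weighting function. The code below therefore shows how those two scores could be extracted from a softmax output, and then applies a purely hypothetical weight, 1 - |p_gt - p_nn|, chosen only because it is high when the two scores are close (ambiguous) and low when the nearest negative dominates (likely noisy). It is not the authors' formula, and as a side effect it also down-weights easy, confidently correct samples.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def gt_and_nearest_negative(probs, label):
    """Return (score of the ground-truth class, highest score among other classes)."""
    p_gt = probs[label]
    p_nn = max(p for i, p in enumerate(probs) if i != label)
    return p_gt, p_nn


def toy_weight(p_gt, p_nn):
    """Hypothetical stand-in for NAW: near 1 when the scores are close, near 0 when far apart."""
    return 1.0 - abs(p_gt - p_nn)


# Ambiguous sample: ground-truth class 0 and class 1 score almost equally.
w_ambiguous = toy_weight(*gt_and_nearest_negative(softmax([2.0, 1.9, 0.1]), 0))

# Likely noisy sample: the model strongly prefers class 1 over the given label 0.
w_noisy = toy_weight(*gt_and_nearest_negative(softmax([0.0, 4.0, 0.0]), 0))

print(w_ambiguous > w_noisy)  # True
```

In a training loop, a weight of this kind would typically scale each sample's loss term; how NLA combines it with the consistency regularizer is not specified on this page.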
JunGyu Lee
Korea Institute of Science and Technology, Seoul, Korea; AI-Robotics, KIST School, University of Science and Technology, Daejeon, Korea
Yeji Choi
DI Lab Inc.
weather and climate, precipitation, remote sensing, deep learning, image segmentation
Haksub Kim
Korea Institute of Science and Technology, Seoul, Korea
Ig-Jae Kim
KIST
Deep Learning, Computer Graphics, Computer Vision, Image Processing
G. Nam
Korea Institute of Science and Technology, Seoul, Korea; AI-Robotics, KIST School, University of Science and Technology, Daejeon, Korea