Enhancing Facial Expression Recognition through Dual-Direction Attention Mixed Feature Networks and CLIP: Application to 8th ABAW Challenge

📅 2025-03-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the 8th ABAW Challenge at CVPR 2025, tackling three facial affect analysis tasks as independent challenges: valence-arousal (VA) estimation, discrete emotion recognition, and facial action unit (AU) detection. The authors apply the Dual-Direction Attention Mixed Feature Network (DDAMFN) to all three tasks and, as an additional experiment, explore the CLIP vision-language model for the emotion recognition task. Evaluated on the official ABAW test sets, their methods surpass the proposed baselines on all three tasks, and the paper offers insights into the architectural choices behind this performance.
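The dual-direction attention idea named above can be sketched as attention computed separately along the two spatial directions of a feature map and then recombined. The snippet below is an illustrative approximation of that pattern, not the authors' exact DDAMFN module; the pooling and sigmoid choices are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dual_direction_attention(x):
    """Illustrative sketch of dual-direction spatial attention.

    x: feature map of shape (C, H, W). Attention maps are built from
    the two directional poolings and used to re-weight the input.
    This is an approximation for exposition, not the DDAMFN module.
    """
    # Pool along the width axis -> per-row descriptor of shape (C, H, 1)
    a_h = sigmoid(x.mean(axis=2, keepdims=True))
    # Pool along the height axis -> per-column descriptor of shape (C, 1, W)
    a_w = sigmoid(x.mean(axis=1, keepdims=True))
    # Broadcast both directional attention maps over the input
    return x * a_h * a_w

feat = np.random.randn(64, 7, 7)
out = dual_direction_attention(feat)
print(out.shape)  # (64, 7, 7)
```

In the full network, the directional descriptors would pass through learned layers before the sigmoid; the fixed pooling here only conveys the two-direction structure.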

📝 Abstract
We present our contribution to the 8th ABAW challenge at CVPR 2025, where we tackle valence-arousal estimation, emotion recognition, and facial action unit detection as three independent challenges. Our approach leverages the well-known Dual-Direction Attention Mixed Feature Network (DDAMFN) for all three tasks, achieving results that surpass the proposed baselines. Additionally, we explore the use of CLIP for the emotion recognition challenge as an additional experiment. We provide insights into the architectural choices that contribute to the strong performance of our methods.
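A common way to reuse CLIP for emotion recognition, consistent with the exploratory experiment the abstract describes, is a linear probe on frozen image embeddings. The sketch below illustrates only that general pattern: the random vectors stand in for CLIP image-encoder outputs, the 512-d width and the probe weights are hypothetical, and only the class count of 8 (the ABAW expression set) comes from the challenge setup.

```python
import numpy as np

# Hypothetical stand-ins: in practice these vectors would come from a
# frozen CLIP image encoder; random vectors are used here instead.
rng = np.random.default_rng(0)
n_images, embed_dim, n_emotions = 4, 512, 8  # 8 ABAW expression classes

clip_embeddings = rng.standard_normal((n_images, embed_dim))

# Linear probe: one learned layer mapping embeddings to emotion logits
W = rng.standard_normal((embed_dim, n_emotions)) * 0.01
b = np.zeros(n_emotions)

logits = clip_embeddings @ W + b

# Numerically stable softmax over the emotion classes
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

pred = probs.argmax(axis=1)  # one predicted class index per image
print(pred.shape)  # (4,)
```

Training the probe (e.g. with cross-entropy) while keeping the CLIP backbone frozen is the standard way to test how transferable its representations are to a new label set.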
Problem

Research questions and friction points this paper is trying to address.

Improves facial expression recognition accuracy
Addresses valence-arousal estimation and emotion recognition
Enhances facial action unit detection performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Direction Attention Mixed Feature Networks
CLIP for emotion recognition enhancement
Architectural insights for superior performance
Josep Cabacas-Maso
eHealth Center, Faculty of Computer Science, Multimedia and Telecommunication, Universitat Oberta de Catalunya, 08018 Barcelona, Spain
Elena Ortega-Beltrán
eHealth Center, Faculty of Computer Science, Multimedia and Telecommunication, Universitat Oberta de Catalunya, 08018 Barcelona, Spain
Ismael Benito-Altamirano
eHealth Center, Faculty of Computer Science, Multimedia and Telecommunication, Universitat Oberta de Catalunya, 08018 Barcelona, Spain; MIND/IN2UB, Department of Electronic and Biomedical Engineering, Universitat de Barcelona, 08028 Barcelona, Spain
Carles Ventura
Universitat Oberta de Catalunya (UOC)
Computer vision; Image and video segmentation