Decoupled Doubly Contrastive Learning for Cross Domain Facial Action Unit Detection.

📅 2025-03-05

🏛️ IEEE Transactions on Image Processing

📈 Citations: 0

✨ Influential: 0

career value

217K/year

🤖 AI Summary

To address performance degradation in cross-domain facial Action Unit (AU) detection caused by domain shift, this paper proposes the Decoupled Dual-Contrastive Adaptation (D2CA) framework. Methodologically, D2CA introduces the first automatic disentanglement mechanism separating AU-specific and domain-specific factors, thereby partitioning feature subspaces into AU-relevant and AU-irrelevant components; it further integrates image-level and feature-level dual contrastive learning to achieve semantic alignment and scale-controllable cross-domain face synthesis. In terms of contributions, D2CA is the first to jointly model feature disentanglement, dual contrastive learning, and AU-conditioned domain adaptation. Extensive experiments across multiple cross-domain settings demonstrate an average F1-score improvement of 6–14% over state-of-the-art methods. Moreover, the synthesized faces exhibit both high visual fidelity and strong AU semantic preservation.

Technology Category

Application Category

📝 Abstract

Despite the impressive performance of current vision-based facial action unit (AU) detection approaches, they are heavily susceptible to the variations across different domains and the cross-domain AU detection methods are under-explored. In response to this challenge, we propose a decoupled doubly contrastive adaptation (D2CA) approach to learn a purified AU representation that is semantically aligned for the source and target domains. Specifically, we decompose latent representations into AU-relevant and AU-irrelevant components, with the objective of exclusively facilitating adaptation within the AU-relevant subspace. To achieve the feature decoupling, D2CA is trained to disentangle AU and domain factors by assessing the quality of synthesized faces in cross-domain scenarios when either AU or domain attributes are modified. To further strengthen feature decoupling, particularly in scenarios with limited AU data diversity, D2CA employs a doubly contrastive learning mechanism comprising image and feature-level contrastive learning to ensure the quality of synthesized faces and mitigate feature ambiguities. This new framework leads to an automatically learned, dedicated separation of AU-relevant and domain-relevant factors, and it enables intuitive, scale-specific control of the cross-domain facial image synthesis. Extensive experiments demonstrate the efficacy of D2CA in successfully decoupling AU and domain factors, yielding visually pleasing cross-domain synthesized facial images. Meanwhile, D2CA consistently outperforms state-of-the-art cross-domain AU detection approaches, achieving an average F1 score improvement of 6%-14% across various cross-domain scenarios.

Problem

Research questions and friction points this paper is trying to address.

Addresses cross-domain facial action unit detection challenges

Proposes decoupled doubly contrastive adaptation for purified AU representation

Enhances feature decoupling with contrastive learning mechanisms

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoupled doubly contrastive adaptation for AU detection

Feature decoupling into AU-relevant and irrelevant components

Doubly contrastive learning enhances cross-domain synthesis

🔎 Similar Papers

Learning Contrastive Feature Representations for Facial Action Unit Detection