2D-3D Attention and Entropy for Pose Robust 2D Facial Recognition

📅 2025-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the significant performance degradation of 2D face recognition under large pose variations, this paper proposes a 2D–3D cross-modal domain adaptation framework aimed at pose invariance. Methodologically, it introduces (1) a shared cross-modal attention mapping that emphasizes the patterns most correlated between 2D image and 3D point cloud representations, and (2) a joint entropy regularization loss that uses both attention maps to promote consistency between the two modalities. Crucially, the method operates without 3D label supervision, leveraging 3D point clouds as geometric guidance to learn pose-robust 2D representations. Evaluated on FaceScape and ARL-VTF, it improves TAR at 1% FAR for profile (90°+) face recognition by at least 7.1% and 1.57%, respectively, outperforming competitive methods.

📝 Abstract

Despite recent advances in facial recognition, there remains a fundamental issue concerning degradations in performance due to substantial perspective (pose) differences between enrollment and query (probe) imagery. Therefore, we propose a novel domain adaptive framework to facilitate improved performances across large discrepancies in pose by enabling image-based (2D) representations to infer properties of inherently pose invariant point cloud (3D) representations. Specifically, our proposed framework achieves better pose invariance by using (1) a shared (joint) attention mapping to emphasize common patterns that are most correlated between 2D facial images and 3D facial data and (2) a joint entropy regularizing loss to promote better consistency, enhancing correlations among the intersecting 2D and 3D representations, by leveraging both attention maps. This framework is evaluated on FaceScape and ARL-VTF datasets, where it outperforms competitive methods by achieving profile (90°+) TAR @ 1% FAR improvements of at least 7.1% and 1.57%, respectively.
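
The headline metric, TAR @ 1% FAR, is the true accept rate measured at the score threshold where at most 1% of impostor pairs are falsely accepted. The sketch below shows one common way to compute it from similarity scores. The function name `tar_at_far` and the strict "greater than threshold" accept rule are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np

def tar_at_far(genuine_scores, impostor_scores, far=0.01):
    """Return (TAR, threshold) with the false accept rate held at or below `far`.

    Scores are similarities: higher means more likely the same identity.
    Hypothetical helper for illustration only.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    # Sort impostor scores from highest to lowest.
    impostor = np.sort(np.asarray(impostor_scores, dtype=float))[::-1]
    # At most k impostor pairs may score above the threshold.
    k = int(np.floor(far * impostor.size))
    # Use the (k+1)-th highest impostor score; accept only scores strictly above it.
    threshold = impostor[k] if k < impostor.size else -np.inf
    tar = float(np.mean(genuine > threshold))
    return tar, float(threshold)

# Toy scores: 100 impostor pairs and 4 genuine pairs.
impostor = np.arange(100)
genuine = np.array([50.0, 98.5, 99.5, 100.0])
tar, thr = tar_at_far(genuine, impostor, far=0.01)
```

With these toy scores the threshold falls on the second-highest impostor score, so exactly one of the 100 impostor pairs is accepted and three of the four genuine pairs pass.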

Problem

Research questions and friction points this paper is trying to address.

Improving 2D facial recognition under large pose variations

Bridging 2D and 3D representations for pose invariance

Enhancing cross-pose performance via attention and entropy regularization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Joint 2D-3D attention mapping for common patterns

Entropy loss enhances 2D-3D representation consistency

Domain adaptation improves pose-invariant facial recognition
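
The page does not give formulas for the joint attention map or the entropy loss, so the following is only a rough sketch of the general idea: build one attention map per modality, combine them so that locations both modalities attend to are emphasized, and measure how diffuse the result is with Shannon entropy. Everything here is an assumption rather than the paper's formulation, including the function names, the elementwise-product combination, and the premise that 2D and 3D features are aligned over the same N locations.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_map(features, query):
    """Attention over N locations; features is (N, D), query is (D,)."""
    return softmax(features @ query / np.sqrt(features.shape[1]))

def shared_attention(att_2d, att_3d, eps=1e-12):
    """Hypothetical shared map: renormalized elementwise product of both maps."""
    joint = att_2d * att_3d
    return joint / (joint.sum() + eps)

def entropy(p, eps=1e-12):
    """Shannon entropy of a discrete distribution (illustrative regularizer)."""
    return float(-(p * np.log(p + eps)).sum())

# Toy example: random features for N = 6 aligned locations, D = 4 channels.
rng = np.random.default_rng(0)
feat_2d = rng.normal(size=(6, 4))
feat_3d = rng.normal(size=(6, 4))
query = rng.normal(size=4)

att_2d = attention_map(feat_2d, query)
att_3d = attention_map(feat_3d, query)
shared = shared_attention(att_2d, att_3d)
loss_term = entropy(shared)  # lower when the shared map is more concentrated
```

In a training setup, a term like `loss_term` would be added to the recognition loss with a weighting coefficient. How the paper actually weights or constructs it is not stated here.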
