🤖 AI Summary
This work addresses three limitations of radar-based facial authentication and expression recognition: inadequate privacy preservation, insufficient real-time performance, and poor robustness to out-of-distribution (OOD) samples. We propose the first end-to-end 60 GHz FMCW radar framework for these tasks. The method introduces a multi-encoder multi-decoder OOD authentication architecture that verifies a single enrolled subject and rejects all other faces as OOD. For expression recognition, range-Doppler and micro range-Doppler representations are concatenated and passed through a ResNet block that separates dynamic from static expressions; two lightweight MobileViT networks then classify the expressions within each group. Experiments report 94.13% AUROC for OOD detection (FPR95 = 18.12%), 94.70% average accuracy for facial expression recognition, and real-time inference, outperforming state-of-the-art OOD detection methods and several Transformer-based approaches.
📝 Abstract
Out-of-distribution (OOD) detection is essential for the safe deployment of neural networks, as it enables the identification of samples outside the training domain. We present FOODER, a real-time, privacy-preserving radar-based framework that integrates OOD-based facial authentication with facial expression recognition. FOODER operates using low-cost frequency-modulated continuous-wave (FMCW) radar and exploits both range-Doppler and micro range-Doppler representations. The authentication module employs a multi-encoder multi-decoder architecture with Body Part (BP) and Intermediate Linear Encoder-Decoder (ILED) components to classify a single enrolled individual as in-distribution while detecting all other faces as OOD. Upon successful authentication, an expression recognition module is activated. Concatenated radar representations are processed by a ResNet block to distinguish between dynamic and static facial expressions. Based on this categorization, two specialized MobileViT networks are used to classify dynamic expressions (smile, shock) and static expressions (neutral, anger). This hierarchical design enables robust facial authentication and fine-grained expression recognition while preserving user privacy by relying exclusively on radar data. Experiments conducted on a dataset collected with a 60 GHz short-range FMCW radar demonstrate that FOODER achieves an AUROC of 94.13% and an FPR95 of 18.12% for authentication, along with an average expression recognition accuracy of 94.70%. FOODER outperforms state-of-the-art OOD detection methods and several transformer-based architectures while operating efficiently in real time.
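The two authentication metrics quoted above, AUROC and FPR95, can be computed from per-sample OOD scores. The following is a minimal sketch and not FOODER's evaluation code. It assumes that a higher score means "more likely OOD" (for example, a reconstruction error) and that FPR95 is the fraction of OOD samples still accepted when the threshold accepts 95% of in-distribution samples; conventions for FPR95 vary between papers. The function names `auroc` and `fpr_at_95_tpr` are hypothetical.

```python
import numpy as np


def auroc(id_scores, ood_scores):
    """Probability that a random OOD sample scores higher than a random
    in-distribution (ID) sample, counting ties as one half."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Pairwise differences: one row per OOD sample, one column per ID sample.
    diff = ood_scores[:, None] - id_scores[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID at the threshold that
    accepts roughly 95% of the ID samples (assumed convention)."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Samples with score <= threshold are accepted as in-distribution.
    threshold = np.percentile(id_scores, 95)
    return float((ood_scores <= threshold).mean())


if __name__ == "__main__":
    # Toy scores, purely illustrative; not data from the paper.
    id_scores = np.arange(100)            # enrolled user
    ood_scores = np.array([90, 95, 100, 110])  # other faces
    print("AUROC:", auroc(id_scores, ood_scores))
    print("FPR95:", fpr_at_95_tpr(id_scores, ood_scores))
```

A higher AUROC and a lower FPR95 both indicate better separation between the enrolled user and all other faces. The pairwise AUROC above uses memory proportional to the product of the two sample counts, so a rank-based implementation is preferable for large evaluation sets.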