Multi-Domain Biometric Recognition using Body Embeddings

📅 2025-03-13
📈 Citations: 0
Influential: 0

🤖 AI Summary
To address the performance degradation that domain discrepancies cause in cross-spectral (SWIR/MWIR/LWIR/RGB) human biometric recognition, this paper uses body embeddings in place of conventional face embeddings to make identification more robust across infrared and visible-light domains. Methodologically, it adopts a Vision Transformer architecture jointly optimized with cross-entropy and triplet losses, trained and evaluated on the IARPA IJB-MDF dataset. Key contributions include: (1) evidence that body embeddings outperform face embeddings for cross-spectral identification, most notably in the MWIR and LWIR bands; (2) benchmark results on IJB-MDF for matching SWIR, MWIR, and LWIR images against RGB images; and (3) empirical insight into how well models pretrained on visible-light data adapt to infrared domains when fine-tuned on small datasets. Fine-tuning a VIS-pretrained body model also achieves state-of-the-art mAP on the LLCM dataset.

📝 Abstract
Biometric recognition becomes increasingly challenging as we move away from the visible spectrum to infrared imagery, where domain discrepancies significantly impact identification performance. In this paper, we show that body embeddings perform better than face embeddings for cross-spectral person identification in medium-wave infrared (MWIR) and long-wave infrared (LWIR) domains. Due to the lack of multi-domain datasets, previous research on cross-spectral body identification (also known as Visible-Infrared Person Re-Identification, or VI-ReID) has primarily focused on individual infrared bands, such as near-infrared (NIR) or LWIR, separately. We address the multi-domain body recognition problem using the IARPA Janus Benchmark Multi-Domain Face (IJB-MDF) dataset, which enables matching of short-wave infrared (SWIR), MWIR, and LWIR images against RGB (VIS) images. We leverage a vision transformer architecture to establish benchmark results on the IJB-MDF dataset and, through extensive experiments, provide valuable insights into the interrelation of infrared domains, the adaptability of VIS-pretrained models, the role of local semantic features in body embeddings, and effective training strategies for small datasets. Additionally, we show that fine-tuning a body model, pretrained exclusively on VIS data, with a simple combination of cross-entropy and triplet losses achieves state-of-the-art mAP scores on the LLCM dataset.
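The abstract reports results as mAP for matching infrared probes against a VIS gallery. As a rough illustration of how such a score can be computed, here is a minimal sketch in plain Python. It assumes cosine similarity for ranking and a simple `(embedding, identity)` data layout; both are assumptions made for this example, and the function names are hypothetical. The paper's actual evaluation protocol is not described on this page.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def average_precision(probe_emb, probe_id, gallery):
    """AP for one probe; gallery is a list of (embedding, identity) pairs."""
    # Rank gallery entries from most to least similar to the probe.
    ranked = sorted(gallery, key=lambda g: cosine(probe_emb, g[0]), reverse=True)
    hits, precisions = 0, []
    for rank, (_, gallery_id) in enumerate(ranked, start=1):
        if gallery_id == probe_id:
            hits += 1
            precisions.append(hits / rank)  # precision at each correct match
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(probes, gallery):
    """Mean of per-probe AP; probes is a list of (embedding, identity) pairs."""
    aps = [average_precision(emb, pid, gallery) for emb, pid in probes]
    return sum(aps) / len(aps)

# Toy data (made up): a VIS gallery and two infrared probes of identity "A".
vis_gallery = [([1.0, 0.0], "A"), ([0.0, 1.0], "B"), ([0.9, 0.1], "A")]
ir_probes = [([1.0, 0.1], "A"), ([0.2, 1.0], "A")]
score = mean_average_precision(ir_probes, vis_gallery)
```

Published re-identification protocols often add rules this sketch omits, such as excluding gallery images from the same camera as the probe.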
Problem

Research questions and friction points this paper is trying to address.

Addresses the drop in person-identification performance caused by domain discrepancies between visible and infrared imagery.
Tests whether body embeddings outperform face embeddings in the MWIR and LWIR domains.
Uses the IJB-MDF dataset to match SWIR, MWIR, and LWIR images against RGB (VIS) images.
Innovation

Methods, ideas, or system contributions that make the work stand out.

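Body embeddings outperform face embeddings for cross-spectral identification in the MWIR and LWIR domains.
A vision transformer establishes benchmark results for matching infrared images against RGB images on IJB-MDF.
Fine-tuning a VIS-pretrained body model with cross-entropy and triplet losses achieves state-of-the-art mAP on the LLCM dataset.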
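
The combination of cross-entropy and triplet losses mentioned above can be sketched in plain Python for a single training triplet. This is an illustrative sketch only: the margin of 0.3, the weighting factor of 1.0, the Euclidean distance, and all function names are assumptions made here, not values or choices reported on this page.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (identity classification)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Penalize the anchor being closer to the negative than to the positive."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)

def combined_loss(logits, label, anchor, positive, negative,
                  margin=0.3, weight=1.0):
    """Cross-entropy on identity logits plus a weighted triplet term."""
    return (cross_entropy(logits, label)
            + weight * triplet_loss(anchor, positive, negative, margin))

# Toy example (made up): a VIS anchor, an infrared image of the same
# person as the positive, and a different person as the negative.
anchor, positive, negative = [0.0, 1.0], [0.1, 0.9], [1.0, 0.0]
loss = combined_loss([2.0, 0.0, 0.0], 0, anchor, positive, negative)
```

In this toy case the triplet term is zero because the positive is already much closer to the anchor than the negative, so only the classification term contributes to the total.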