Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait

📅 2025-05-07
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This paper addresses unconstrained whole-body person identification under high-altitude, long-range imaging conditions, where recognition is hampered by large pose and scale variations, image degradation, and cross-domain discrepancies. It proposes FarSight, an end-to-end system featuring three key innovations: (1) a quality-guided multimodal fusion mechanism; (2) recognition-oriented modules for video restoration, multi-person tracking, and modality-specific feature encoding; and (3) a dynamic weight fusion network coupled with quality-aware confidence modeling. FarSight jointly leverages face, body shape, and gait features to stay robust under degradation. On the BRIAR benchmark, relative to the authors' preliminary system, it achieves a 34.1% absolute gain in 1:1 verification (TAR@0.1% FAR), a 17.8% increase in closed-set identification (Rank-20), and a 34.3% reduction in open-set identification errors (FNIR@1% FPIR). The system was also evaluated in the 2025 NIST RTE Face in Video Evaluation (FIVE).

📝 Abstract
We address the problem of whole-body person recognition in unconstrained environments. This problem arises in surveillance scenarios such as those in the IARPA Biometric Recognition and Identification at Altitude and Range (BRIAR) program, where biometric data is captured at long standoff distances, elevated viewing angles, and under adverse atmospheric conditions (e.g., turbulence and high wind velocity). To this end, we propose FarSight, a unified end-to-end system for person recognition that integrates complementary biometric cues across face, gait, and body shape modalities. FarSight incorporates novel algorithms across four core modules: multi-subject detection and tracking, recognition-aware video restoration, modality-specific biometric feature encoding, and quality-guided multi-modal fusion. These components are designed to work cohesively under degraded image conditions, large pose and scale variations, and cross-domain gaps. Extensive experiments on the BRIAR dataset, one of the most comprehensive benchmarks for long-range, multi-modal biometric recognition, demonstrate the effectiveness of FarSight. Compared to our preliminary system, this system achieves a 34.1% absolute gain in 1:1 verification accuracy (TAR@0.1% FAR), a 17.8% increase in closed-set identification (Rank-20), and a 34.3% reduction in open-set identification errors (FNIR@1% FPIR). Furthermore, FarSight was evaluated in the 2025 NIST RTE Face in Video Evaluation (FIVE), which conducts standardized face recognition testing on the BRIAR dataset. These results establish FarSight as a state-of-the-art solution for operational biometric recognition in challenging real-world conditions.
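To make the headline verification metric concrete, the sketch below shows one common way to compute TAR at a fixed FAR from lists of genuine and impostor similarity scores. It is an illustration only: the function name `tar_at_far`, the strict-threshold convention, and the toy scores are assumptions, not the BRIAR or NIST evaluation protocol.

```python
import math

def tar_at_far(genuine, impostor, far=0.001):
    """Return (TAR, threshold) at the strictest threshold whose FAR is at most `far`.

    genuine:  similarity scores for same-identity pairs
    impostor: similarity scores for different-identity pairs
    A pair is accepted when its score is strictly greater than the threshold.
    """
    ranked = sorted(impostor, reverse=True)
    # Number of impostor pairs we may falsely accept without exceeding `far`.
    k = math.floor(far * len(ranked))
    # Rejecting everything at or below the (k+1)-th highest impostor score
    # leaves at most k false accepts.
    threshold = ranked[k] if k < len(ranked) else float("-inf")
    accepted = sum(1 for s in genuine if s > threshold)
    return accepted / len(genuine), threshold

# Toy scores (hypothetical, for illustration only).
genuine_scores = [0.9, 0.8, 0.3, 0.6]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.15, 0.25, 0.35, 0.05, 0.12, 0.22]
tar, thr = tar_at_far(genuine_scores, impostor_scores, far=0.1)
print(f"TAR={tar:.2f} at threshold {thr:.2f}")
```

With very few impostor pairs, a target such as 0.1% FAR allows zero false accepts, so the threshold falls at the highest impostor score; real evaluations use far larger impostor sets.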
Problem

Research questions and friction points this paper is trying to address.

Recognizing persons at long distances and high angles
Integrating face, gait, and body shape for identification
Overcoming degraded image conditions and pose variations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fusion of face, gait, and body shape modalities
End-to-end system with four novel core modules
Quality-guided multi-modal fusion under degraded conditions
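As a rough intuition for quality-guided fusion, the sketch below weights each modality's match score by a softmax over per-modality quality estimates, so a degraded modality contributes less to the final score. This is a simplified stand-in, not the fusion network described in the paper: the function `quality_weighted_fusion`, the `temperature` parameter, and the example scores and qualities are all hypothetical.

```python
import math

def quality_weighted_fusion(scores, qualities, temperature=1.0):
    """Fuse per-modality match scores using softmax weights over quality estimates.

    scores, qualities: dicts keyed by modality name (e.g. "face", "gait", "body").
    Returns (fused_score, weights), where weights sum to 1.
    """
    modalities = sorted(scores)
    logits = [qualities[m] / temperature for m in modalities]
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    weights = {m: e / total for m, e in zip(modalities, exps)}
    fused = sum(weights[m] * scores[m] for m in modalities)
    return fused, weights

# Hypothetical probe: the face is badly degraded, gait and body shape are usable.
scores = {"face": 0.20, "gait": 0.70, "body": 0.65}
qualities = {"face": -2.0, "gait": 1.0, "body": 0.5}
fused, weights = quality_weighted_fusion(scores, qualities)
print(fused, weights)
```

A lower `temperature` pushes the weights toward the single highest-quality modality, while a higher one approaches plain averaging.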
Feng Liu
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824
Nicholas Chimitt
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, 47907
Lanqing Guo
Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, 78712
Jitesh Jain
Georgia Tech
Image Segmentation, Multimodal Reasoning, Computer Vision
Aditya Kane
Georgia Institute of Technology
AI Systems, Computer Vision
Minchul Kim
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824
Wes Robbins
Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, 78712
Yiyang Su
Michigan State University
Computer Vision
Dingqiang Ye
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824
Xingguang Zhang
Purdue University
Image and video processing, computational imaging, computer vision, generative models
Jie Zhu
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824
Siddharth Satyakam
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824
Christopher Perry
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824
Stanley H. Chan
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, 47907
Arun Ross
Professor | Michigan State University
Biometrics, Computer Vision, Pattern Recognition, Iris Recognition
Humphrey Shi
Georgia Tech | UIUC || ...
High Performance AI, Computer Vision, Multimodal, Creative AI, AI Systems
Zhangyang Wang
Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, 78712
Anil Jain
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824
Xiaoming Liu
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824