AG-VPReID.VIR: Bridging Aerial and Ground Platforms for Video-based Visible-Infrared Person Re-ID

📅 2025-07-23
🤖 AI Summary
Problem: Existing cross-modal person re-identification (Re-ID) datasets are limited to ground-level views, which hinders their use in round-the-clock, cross-platform scenarios, particularly aerial-to-ground settings. Method: We introduce AG-VPReID.VIR, the first aerial-ground cross-modal video person Re-ID dataset, comprising 1,837 identities and 4,861 tracklets, supporting day/night operation, cross-view matching, cross-modal alignment, and dynamic spatiotemporal association. We further propose TCC-VPReID, a three-stream network integrating style-robust feature learning, memory-based cross-view adaptation, and intermediary-guided temporal modeling to jointly address platform heterogeneity and modality discrepancy. Contribution/Results: Experiments under multiple evaluation protocols show significant performance gains over state-of-the-art methods, supporting both the difficulty of the dataset and the effectiveness of the model. AG-VPReID.VIR establishes a new benchmark for day-and-night, cross-perspective person Re-ID and offers a technical pathway toward robust aerial-ground matching.

📝 Abstract
Person re-identification (Re-ID) across visible and infrared modalities is crucial for 24-hour surveillance systems, but existing datasets primarily focus on ground-level perspectives. While ground-based IR systems offer nighttime capabilities, they suffer from occlusions, limited coverage, and vulnerability to obstructions, all problems that aerial perspectives uniquely solve. To address these limitations, we introduce AG-VPReID.VIR, the first aerial-ground cross-modality video-based person Re-ID dataset. This dataset captures 1,837 identities across 4,861 tracklets (124,855 frames) using both UAV-mounted and fixed CCTV cameras in RGB and infrared modalities. AG-VPReID.VIR presents unique challenges including cross-viewpoint variations, modality discrepancies, and temporal dynamics. Additionally, we propose TCC-VPReID, a novel three-stream architecture designed to address the joint challenges of cross-platform and cross-modality person Re-ID. Our approach bridges the domain gaps between aerial-ground perspectives and RGB-IR modalities through style-robust feature learning, memory-based cross-view adaptation, and intermediary-guided temporal modeling. Experiments show that AG-VPReID.VIR presents distinctive challenges compared to existing datasets, with our TCC-VPReID framework achieving significant performance gains across multiple evaluation protocols. Dataset and code are available at https://github.com/agvpreid25/AG-VPReID.VIR.
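
To put the dataset figures in the abstract into perspective, the short sketch below derives two averages from them: tracklets per identity and frames per tracklet. Only the three counts come from the abstract; the function and variable names are illustrative, and the averages say nothing about how tracklets are actually distributed across identities, cameras, or modalities.

```python
# Counts quoted in the abstract of AG-VPReID.VIR.
identities = 1837
tracklets = 4861
frames = 124855


def dataset_ratios(identities: int, tracklets: int, frames: int) -> dict:
    """Return simple dataset-wide averages derived from the three counts."""
    return {
        "tracklets_per_identity": tracklets / identities,
        "frames_per_tracklet": frames / tracklets,
    }


ratios = dataset_ratios(identities, tracklets, frames)
print(f"{ratios['tracklets_per_identity']:.2f} tracklets per identity")
print(f"{ratios['frames_per_tracklet']:.1f} frames per tracklet")
```

On average this works out to roughly 2.6 tracklets per identity and about 26 frames per tracklet.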
Problem

Research questions and friction points this paper is trying to address.

Bridging aerial and ground platforms for cross-modality person Re-ID
Addressing occlusion and coverage limits in ground-based IR surveillance
Solving cross-viewpoint and modality discrepancies in video Re-ID
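
As a rough illustration of what video-based cross-modality matching involves, the sketch below mean-pools per-frame features into a tracklet embedding and ranks gallery tracklets by cosine similarity. This is a generic baseline written for illustration, not the TCC-VPReID method and not the paper's evaluation protocol; the feature values, identity labels, and function names are all made up.

```python
import math


def mean_pool(frames):
    """Average per-frame feature vectors into one tracklet embedding."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank1_accuracy(queries, gallery):
    """Fraction of queries whose most similar gallery tracklet shares their identity.

    Both arguments are lists of (identity, list_of_frame_features).
    """
    gallery_emb = [(pid, mean_pool(fr)) for pid, fr in gallery]
    hits = 0
    for pid, fr in queries:
        q = mean_pool(fr)
        best_pid, _ = max(gallery_emb, key=lambda g: cosine(q, g[1]))
        hits += best_pid == pid
    return hits / len(queries)


# Hypothetical 3-D features: gallery stands in for ground RGB tracklets,
# queries for aerial infrared tracklets of the same two people.
gallery = [
    ("id_a", [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]),
    ("id_b", [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]),
]
queries = [
    ("id_a", [[0.8, 0.2, 0.1], [1.0, 0.1, 0.0]]),
    ("id_b", [[0.6, 0.5, 0.0], [0.7, 0.4, 0.1]]),  # shifted features, closer to id_a
]

score = rank1_accuracy(queries, gallery)
print(f"rank-1 accuracy: {score:.2f}")
```

In this toy setup the second query is matched to the wrong identity because its features have drifted, which is the kind of failure that viewpoint and modality gaps tend to cause when features are not aligned across platforms.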
Innovation

Methods, ideas, or system contributions that make the work stand out.

First aerial-ground cross-modality Re-ID dataset
Three-stream architecture for cross-platform Re-ID
Style-robust feature learning and temporal modeling
Huy Nguyen
School of Electrical Engineering and Robotics, Queensland University of Technology
Kien Nguyen
Institute for Advanced Academic Research & Graduate School of Informatics, Chiba University
IoT, networking, wireless, network virtualization, SDN
Akila Pemasiri
School of Electrical Engineering and Robotics, Queensland University of Technology
Akmal Jahan
School of Electrical Engineering and Robotics, Queensland University of Technology
Clinton Fookes
Queensland University of Technology
Computer Vision, Machine Learning, Signal Processing, AI, Video Analytics/Biometrics/Medical Imaging
Sridha Sridharan
Professor
computer vision, machine learning, speaker recognition, biometrics, image processing