🤖 AI Summary
This work addresses the severe performance degradation of cross-view (aerial-to-ground) person re-identification under extreme long-range conditions, which is caused by drastic resolution loss, large viewpoint shifts, motion blur, and clothing variation. To this end, we formally define the task for the first time and introduce VReID-XFD, a new video benchmark and challenge built upon the DetReIDX dataset, which comprises 371 identities and 11,288 tracklets. The benchmark features a strictly identity-disjoint evaluation protocol, multi-view captures at altitudes of 5.8 to 120 m with depression angles ranging from 30° (oblique) to 90° (nadir), tracklet-level annotations, and rich physical metadata. The accompanying challenge attracted 10 participating teams. The top-performing method, SAS-PReID, reaches only 43.93% mAP, which highlights both the difficulty of the task and the limitations of current approaches, and establishes a baseline for future research.
📝 Abstract
Person re-identification (ReID) across aerial and ground views at extreme far distances introduces a distinct operating regime where severe resolution degradation, extreme viewpoint changes, unstable motion cues, and clothing variation jointly undermine the appearance-based assumptions of existing ReID systems. To study this regime, we introduce VReID-XFD, a video-based benchmark and community challenge for extreme far-distance (XFD) aerial-to-ground person re-identification. VReID-XFD is derived from the DetReIDX dataset and comprises 371 identities, 11,288 tracklets, and 11.75 million frames, captured across altitudes from 5.8 m to 120 m, viewing angles from oblique (30°) to nadir (90°), and horizontal distances up to 120 m. The benchmark supports aerial-to-aerial, aerial-to-ground, and ground-to-aerial evaluation under strict identity-disjoint splits, with rich physical metadata. The VReID-XFD-25 Challenge attracted 10 teams with hundreds of submissions. Systematic analysis reveals monotonic performance degradation with altitude and distance, a universal disadvantage of nadir views, and a trade-off between peak performance and robustness. Even the best-performing SAS-PReID method achieves only 43.93% mAP in the aerial-to-ground setting. The dataset, annotations, and official evaluation protocols are publicly available at https://www.it.ubi.pt/DetReIDX/.
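To make the headline metric concrete, the following is a minimal sketch of how mean average precision (mAP) is commonly computed for retrieval-style ReID evaluation. It is not the official VReID-XFD protocol: the function names, the distance-matrix input, and the toy identities are all assumptions made for illustration, and the released evaluation code may apply extra rules (for example, filtering by camera or platform) that are not modeled here.

```python
from typing import List, Sequence


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """AP for one query; ranked_matches[i] is True when the gallery
    item at rank i (best first) shares the query's identity."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> float:
    """Mean AP over all queries that have at least one true match.

    dist[q][g] is a hypothetical distance between query tracklet q and
    gallery tracklet g (smaller means more similar).
    """
    aps: List[float] = []
    for q, row in enumerate(dist):
        # Rank gallery items from most to least similar.
        order = sorted(range(len(row)), key=row.__getitem__)
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if any(matches):  # queries without a true match are skipped
            aps.append(average_precision(matches))
    return sum(aps) / len(aps) if aps else 0.0


# Toy example (made-up values): two aerial queries, three ground gallery tracklets.
toy_dist = [
    [0.1, 0.9, 0.4],  # query 0 ranks both identity-1 tracklets first
    [0.2, 0.8, 0.3],  # query 1 ranks its only true match last
]
toy_map = mean_average_precision(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
```

In this toy case the first query has an AP of 1 and the second an AP of 1/3, so the mAP is 2/3. A score of 43.93% under a metric of this kind means that, on average, true matches sit well below the top of the ranked gallery.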