🤖 AI Summary
Existing 3D human pose estimation datasets are limited in outdoor settings that involve severe motion blur, heavy occlusion, and multi-view coordination, and in particular lack real-world, highly dynamic sports scenarios such as professional football matches. To address this, we introduce the first World Cup–scale, multi-scene 3D human pose benchmark, covering the full cycle of football motions. Our method comprises three key innovations: (1) a global pose normalization protocol that integrates cross-stadium geometric alignment and sphere-constrained joint localization; (2) a unified modeling paradigm that jointly handles dynamic illumination and low-frame-rate video inputs; and (3) an integrated pipeline that combines multi-camera calibration, motion-compensated optical flow alignment, spherical harmonic lighting modeling, physics-driven motion blur synthesis, and semi-automatic 3D trajectory annotation. Across six mainstream models, our benchmark yields an average 23.7% reduction in MPJPE (mean per-joint position error, where lower is better) and brings three state-of-the-art methods under the 85 mm threshold for the first time in complex sports scenes.
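For readers unfamiliar with the metric, MPJPE is the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints and frames. The following is a minimal sketch of that standard definition in plain Python, not the benchmark's evaluation code. The function names `mpjpe` and `relative_reduction`, the nested-list input layout, and every number in the usage lines are illustrative assumptions and are not taken from the paper.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error, in the units of the inputs (e.g. mm).

    pred, gt: sequences of frames; each frame is a sequence of (x, y, z) joints.
    This input layout is an assumption made for the sketch.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of frames")
    total, count = 0.0, 0
    for frame_pred, frame_gt in zip(pred, gt):
        if len(frame_pred) != len(frame_gt):
            raise ValueError("each frame must have the same number of joints")
        for p, g in zip(frame_pred, frame_gt):
            total += math.dist(p, g)  # Euclidean distance for one joint
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


def relative_reduction(before, after):
    """Fractional drop in an error metric, e.g. 0.25 for a 25% reduction."""
    return (before - after) / before


# Made-up toy data: one frame, two joints, coordinates in mm.
gt = [[(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]]
pred = [[(3.0, 4.0, 0.0), (100.0, 0.0, 12.0)]]
print(mpjpe(pred, gt))  # joint errors are 5 mm and 12 mm, so the mean is 8.5
```

A percentage such as the one quoted above would then be `relative_reduction(baseline_mpjpe, new_mpjpe) * 100`, averaged over the evaluated models; how the paper aggregates across models is not stated in this summary.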