🤖 AI Summary
Existing video sampling methods disrupt temporal continuity, which makes it hard to capture the discriminative motion patterns that separate experts from novices and leads to inaccurate automated assessment of motor skills. To address this, we propose a proficiency-aware adaptive temporal sampling strategy: (1) a novel skill-proficiency-guided dynamic time segmentation mechanism that preserves action-unit integrity while suppressing inter-segment redundancy; and (2) an integrated framework combining multi-view video analysis, action-completeness constrained optimization, and the SkillFormer architecture for unified modeling of highly dynamic and sequential movements. Evaluated on the EgoExo4D benchmark, our method improves on state-of-the-art proficiency classification accuracy across all viewing configurations (+0.65% to +3.05%), with notable gains in challenging domains (+26.22% in bouldering, +2.39% in music performance, and +1.13% in basketball), indicating that the strategy adapts across diverse activity types.
📝 Abstract
Automated sports skill assessment requires capturing fundamental movement patterns that distinguish expert from novice performance, yet current video sampling methods disrupt the temporal continuity essential for proficiency evaluation. To this end, we introduce Proficiency-Aware Temporal Sampling (PATS), a novel sampling strategy that preserves complete fundamental movements within continuous temporal segments for multi-view skill assessment. PATS adaptively segments videos so that each analyzed portion contains the full execution of critical performance components, and repeats this process across multiple segments to maximize information coverage while maintaining temporal coherence. Evaluated on the EgoExo4D benchmark with SkillFormer, PATS surpasses state-of-the-art accuracy across all viewing configurations (+0.65% to +3.05%) and delivers substantial gains in challenging domains (+26.22% bouldering, +2.39% music, +1.13% basketball). Systematic analysis reveals that PATS adapts to diverse activity characteristics, from high-frequency sampling for dynamic sports to fine-grained segmentation for sequential skills, demonstrating its effectiveness as an adaptive approach to temporal sampling that advances automated skill assessment for real-world applications.
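The abstract does not specify the sampling procedure in detail, so the following is only a minimal sketch of the general idea it describes: drawing several temporally continuous segments spread across a video, as opposed to isolated, evenly spaced frames. The function names and the parameters `num_segments` and `segment_len` are assumptions introduced for illustration; they are not taken from the paper, and the adaptive, proficiency-aware choice of segment length is not modeled here.

```python
def uniform_sample(total_frames, num_frames):
    """Baseline for comparison: evenly spaced single frames over the video."""
    if total_frames <= 0 or num_frames <= 0:
        return []
    step = total_frames / num_frames
    # Take the frame nearest the middle of each equal-width bin.
    return [min(total_frames - 1, int(step * i + step / 2))
            for i in range(num_frames)]


def sample_continuous_segments(total_frames, num_segments, segment_len):
    """Hypothetical segment-based sampler (illustrative, not the paper's code).

    Returns `num_segments` lists of consecutive frame indices, with segment
    start positions spread evenly from the beginning to the end of the video.
    Segments may overlap if num_segments * segment_len > total_frames.
    """
    if total_frames <= 0 or num_segments <= 0 or segment_len <= 0:
        return []
    segment_len = min(segment_len, total_frames)
    last_start = total_frames - segment_len
    if num_segments == 1:
        starts = [last_start // 2]  # a single segment is centered
    else:
        starts = [round(i * last_start / (num_segments - 1))
                  for i in range(num_segments)]
    # Each segment is a run of consecutive frames, so motion inside it
    # is temporally continuous.
    return [list(range(s, s + segment_len)) for s in starts]


if __name__ == "__main__":
    # Same 12-frame budget, two different layouts.
    print(uniform_sample(100, 12))
    print(sample_continuous_segments(100, num_segments=3, segment_len=4))
```

With the same frame budget, the baseline returns twelve isolated frames, while the segment-based variant returns three runs of four consecutive frames each. In an adaptive scheme such as the one the abstract describes, `segment_len` and `num_segments` would presumably vary with the activity rather than being fixed constants as they are in this sketch.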