Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information

📅 2025-03-06
🤖 AI Summary
This work addresses athlete fall classification in figure skating, a fine-grained action recognition task. It proposes Gate-Shift-Pose, which extends Gate-Shift-Fuse networks with skeleton pose data alongside RGB frames, and evaluates two fusion strategies: early fusion, which concatenates Gaussian heatmaps of pose keypoints with the RGB frames at the input, and late fusion, which uses a multi-stream architecture with attention mechanisms to combine RGB and pose features. On the FR-FS dataset, the model outperforms the RGB-only baseline in accuracy by up to 40% with ResNet18 and 20% with ResNet50. The highest accuracy, 98.08%, is obtained with early fusion on ResNet50, while late fusion is better suited to lighter backbones such as ResNet18. The results point to the importance of skeleton pose information for recognizing complex on-ice movements.

📝 Abstract
This paper introduces Gate-Shift-Pose, an enhanced version of Gate-Shift-Fuse networks, designed for athlete fall classification in figure skating by integrating skeleton pose data alongside RGB frames. We evaluate two fusion strategies: early-fusion, which combines RGB frames with Gaussian heatmaps of pose keypoints at the input stage, and late-fusion, which employs a multi-stream architecture with attention mechanisms to combine RGB and pose features. Experiments on the FR-FS dataset demonstrate that Gate-Shift-Pose significantly outperforms the RGB-only baseline, improving accuracy by up to 40% with ResNet18 and 20% with ResNet50. Early-fusion achieves the highest accuracy (98.08%) with ResNet50, leveraging the model's capacity for effective multimodal integration, while late-fusion is better suited for lighter backbones like ResNet18. These results highlight the potential of multimodal architectures for sports action recognition and the critical role of skeleton pose information in capturing complex motion patterns.
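The early-fusion idea in the abstract (Gaussian heatmaps of pose keypoints combined with RGB frames at the input stage) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `keypoints_to_heatmaps` and `early_fuse`, the `sigma` value, the one-channel-per-keypoint layout, and the channel-first `(C, H, W)` frame format are all assumptions made for illustration.

```python
import numpy as np


def keypoints_to_heatmaps(keypoints, height, width, sigma=2.0):
    """Render one Gaussian heatmap per keypoint (hypothetical helper).

    keypoints: sequence of (x, y) pixel coordinates, shape (K, 2).
    Returns a float32 array of shape (K, height, width) whose peak
    value is 1.0 at each keypoint location.
    """
    keypoints = np.asarray(keypoints, dtype=np.float32)
    ys = np.arange(height, dtype=np.float32)[:, None]  # column of row indices, (H, 1)
    xs = np.arange(width, dtype=np.float32)[None, :]   # row of column indices, (1, W)
    heatmaps = []
    for x, y in keypoints:
        # Squared distance of every pixel to the keypoint, via broadcasting.
        dist_sq = (xs - x) ** 2 + (ys - y) ** 2
        heatmaps.append(np.exp(-dist_sq / (2.0 * sigma ** 2)))
    return np.stack(heatmaps, axis=0)


def early_fuse(rgb_frame, keypoints, sigma=2.0):
    """Concatenate an RGB frame (3, H, W) with K keypoint heatmaps.

    Assumption: heatmaps are appended as extra input channels, giving
    a (3 + K, H, W) tensor that a backbone with a widened first
    convolution could consume.
    """
    _, height, width = rgb_frame.shape
    heatmaps = keypoints_to_heatmaps(keypoints, height, width, sigma)
    return np.concatenate([rgb_frame, heatmaps], axis=0)
```

For a 32x48 frame and two keypoints, `early_fuse` returns a 5-channel array: three RGB channels followed by one heatmap per keypoint. A real pipeline would also need a pose estimator to supply the keypoints, which this sketch leaves out.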
Problem

Research questions and friction points this paper is trying to address.

Enhances action recognition in sports using skeleton data.
Compares early-fusion and late-fusion strategies for multimodal integration.
Improves athlete fall classification accuracy in figure skating.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates skeleton pose data with RGB frames.
Uses early-fusion and late-fusion strategies.
Leverages attention mechanisms for feature combination.
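
As a rough illustration of attention-based feature combination in a late-fusion setup, the sketch below weights an RGB feature vector and a pose feature vector with a softmax over per-stream scores. The source does not specify the attention design, so this is an assumed, simplified form: `attention_late_fusion` and the learned scoring vector `score_weights` are hypothetical names, and both streams are assumed to produce features of the same dimension.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def attention_late_fusion(rgb_feat, pose_feat, score_weights):
    """Fuse two per-stream feature batches with modality attention.

    rgb_feat, pose_feat: arrays of shape (B, D) from the two streams.
    score_weights: array of shape (D,), a hypothetical learned vector
        that scores how much each stream should contribute.
    Returns (fused, weights) with shapes (B, D) and (B, 2).
    """
    streams = np.stack([rgb_feat, pose_feat], axis=1)    # (B, 2, D)
    scores = streams @ score_weights                     # (B, 2)
    weights = softmax(scores, axis=1)                    # sums to 1 per sample
    fused = np.sum(weights[..., None] * streams, axis=1)  # weighted sum, (B, D)
    return fused, weights
```

With an all-zero `score_weights`, both streams receive equal weight and the fused feature reduces to the plain average of the two inputs, which makes a convenient sanity check before training.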