🤖 AI Summary
This work proposes E2E-GNet, an end-to-end geometric deep neural network for modeling skeleton sequences that lie in non-Euclidean spaces. A geometric transformation layer optimizes the skeleton motion sequences on the manifold, and a differentiable logarithm map activation then projects them into a linear space. The method further introduces, for the first time, a distortion-aware optimization mechanism that limits the shape distortion caused by this projection, so that essential geometric structure is preserved during joint optimization. Experiments on five datasets spanning three domains show that E2E-GNet outperforms existing methods in action recognition accuracy at lower computational cost.
📝 Abstract
Geometric deep learning has recently gained significant attention in the computer vision community for its ability to capture meaningful representations of data lying in a non-Euclidean space. To this end, we propose E2E-GNet, an end-to-end geometric deep neural network for skeleton-based human motion recognition. To enhance the discriminative power between different motions in the non-Euclidean space, E2E-GNet introduces a geometric transformation layer that jointly optimizes skeleton motion sequences in this space and applies a differentiable logarithm map activation to project them onto a linear space. Building on this, we further design a distortion-aware optimization layer that limits the skeleton shape distortions caused by this projection, enabling the network to retain discriminative geometric cues and achieve a higher motion recognition rate. We demonstrate the impact of each layer through ablation studies, and extensive experiments across five datasets spanning three domains show that E2E-GNet outperforms other methods at lower cost.
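To make the logarithm map projection concrete, the sketch below computes the log map on a unit hypersphere in plain Python. The abstract does not state which manifold E2E-GNet uses, so the sphere is an assumption chosen only because its log map has a simple closed form; the function names `normalize` and `sphere_log_map` are hypothetical and are not taken from the paper. A trainable layer would express the same formula in an automatic differentiation framework.

```python
import math


def normalize(x):
    """Scale a vector to unit length so that it lies on the unit hypersphere."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [c / norm for c in x]


def sphere_log_map(p, q, eps=1e-12):
    """Log map on the unit sphere (illustrative assumption, not the paper's layer).

    Returns the tangent vector at base point p that points toward q and whose
    Euclidean length equals the geodesic distance between p and q.
    Both p and q are expected to be unit vectors of the same dimension.
    """
    # Clamp the inner product so rounding errors cannot push acos out of range.
    cos_theta = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    theta = math.acos(cos_theta)  # geodesic distance on the unit sphere

    # Component of q orthogonal to p; its norm equals sin(theta).
    ortho = [b - cos_theta * a for a, b in zip(p, q)]
    ortho_norm = math.sqrt(sum(c * c for c in ortho))

    if ortho_norm < eps:
        # q coincides with p (or is antipodal, where the log map is not unique).
        return [0.0] * len(p)

    # Rescale the orthogonal direction to have length theta.
    return [theta * c / ortho_norm for c in ortho]


# Usage: project a point on the sphere into the tangent (linear) space at p.
p = normalize([1.0, 0.0, 0.0])
q = normalize([0.0, 1.0, 0.0])
v = sphere_log_map(p, q)
```

The resulting vector `v` lies in the tangent space at `p`, which is a linear space where standard layers can operate. Points far from the base point are represented less faithfully by such a projection, which is the kind of distortion the distortion-aware optimization layer is described as limiting.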