Language-Guided and Motion-Aware Gait Representation for Generalizable Recognition

📅 2026-01-17
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing gait recognition methods are highly susceptible to static nuisances such as clothing variations and struggle to model dynamic motion characteristics, which limits their generalization. To address this, the paper proposes LMGait, a framework that, for the first time, integrates language-guided mechanisms with motion-aware representation learning. Gait-related textual prompts steer the model's attention toward key dynamic regions, yielding representations that are robust to static interference yet sensitive to motion changes. This design substantially mitigates overfitting to static noise and improves both accuracy and generalization in challenging cross-scenario gait recognition settings.
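The summary describes steering spatial attention with gait-related textual prompts. As a minimal, hypothetical PyTorch sketch of what such language-guided attention could look like (the prompt wording, CLIP-style text encoder, and feature dimensions are assumptions, not the authors' released code):

```python
# Hypothetical sketch of language-guided spatial attention for gait features.
# The prompt wording, encoder choice, and dimensions are assumptions, not the
# authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LanguageGuidedAttention(nn.Module):
    """Weights spatial locations of a gait feature map by their
    similarity to an embedding of a gait-related text prompt."""

    def __init__(self, visual_dim: int = 256, text_dim: int = 512):
        super().__init__()
        # Project the text embedding into the visual feature space.
        self.text_proj = nn.Linear(text_dim, visual_dim)

    def forward(self, feat: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) per-frame visual features
        # text_emb: (B, D_text) embedding of a prompt such as
        # "the swinging arms and legs of a walking person"
        q = F.normalize(self.text_proj(text_emb), dim=-1)  # (B, C)
        v = F.normalize(feat.flatten(2), dim=1)            # (B, C, H*W)
        # Cosine similarity between the prompt and every spatial location.
        attn = torch.einsum('bc,bcn->bn', q, v)            # (B, H*W)
        attn = attn.softmax(dim=-1).view(feat.size(0), 1, *feat.shape[2:])
        # Emphasize motion-relevant regions; the residual keeps static context.
        return feat + feat * attn

# Usage sketch (backbone and text encoder are placeholders):
# feats = backbone(silhouettes)          # (B, 256, H, W)
# text  = clip_model.encode_text(tokens) # (B, 512)
# out   = LanguageGuidedAttention()(feats, text)
```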

πŸ“ Abstract
Gait recognition is emerging as a promising technology and an innovative field within computer vision, with a wide range of applications in remote human identification. However, existing methods typically rely on complex architectures to directly extract features from images and apply pooling operations to obtain sequence-level representations. Such designs often lead to overfitting on static noise (e.g., clothing), while failing to effectively capture dynamic motion regions, such as the arms and legs. This bottleneck is particularly challenging in the presence of intra-class variation, where gait features of the same individual under different environmental conditions are significantly distant in the feature space. To address the above challenges, we present a Language-guided and Motion-aware gait recognition framework, named LMGait. To the best of our knowledge, LMGait is the first method to introduce natural language descriptions as explicit semantic priors into the gait recognition task. In particular, we utilize designed gait-related language cues to capture key motion features in gait sequences. To improve cross-modal alignment, we propose the Motion Awareness Module (MAM), which refines the language features by adaptively adjusting various levels of semantic information to ensure better alignment with the visual representations. Furthermore, we introduce the Motion Temporal Capture Module (MTCM) to enhance the discriminative capability of gait features and improve the model's motion tracking ability. We conducted extensive experiments across multiple datasets, and the results demonstrate the significant advantages of our proposed network. Specifically, our model achieved accuracies of 88.5%, 97.1%, and 97.5% on the CCPG, SUSTech1K, and CASIA-B datasets, respectively, achieving state-of-the-art performance. Homepage: https://dingwu1021.github.io/LMGait/
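The abstract describes MAM and MTCM only at a high level. As a rough illustration of the temporal-motion idea behind a module like MTCM, here is a minimal sketch assuming a frame-difference plus temporal-convolution design; the layer choices, shapes, and pooling are assumptions, not the paper's actual MTCM:

```python
# Hypothetical sketch of a motion-temporal capture step. The design below
# (frame differencing + temporal conv + max pooling) is an assumption, not
# the authors' MTCM implementation.
import torch
import torch.nn as nn

class MotionTemporalCapture(nn.Module):
    """Highlights inter-frame motion by combining per-frame features
    with their temporal differences, then mixing along time."""

    def __init__(self, channels: int = 256, kernel: int = 3):
        super().__init__()
        self.temporal_mix = nn.Conv1d(
            channels, channels, kernel_size=kernel, padding=kernel // 2)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (B, T, C) frame-level gait features
        # Temporal differences emphasize moving parts (arms, legs)
        # over static appearance such as clothing.
        diff = seq[:, 1:] - seq[:, :-1]                        # (B, T-1, C)
        diff = torch.cat([diff, diff[:, -1:]], dim=1)          # pad back to T
        mixed = self.temporal_mix((seq + diff).transpose(1, 2))  # (B, C, T)
        # Sequence-level representation via temporal max pooling.
        return mixed.max(dim=-1).values                        # (B, C)
```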
Problem

Research questions and friction points this paper is trying to address.

gait recognition
overfitting
static noise
dynamic motion
feature representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

language-guided
motion-aware
gait recognition
generalizable representation
dynamic motion features
Zhengxian Wu
Tsinghua University
Computer Vision, Large Language Model
Chuanrui Zhang
Tsinghua University
Computer Vision
Shenao Jiang
The Shenzhen International Graduate School, Tsinghua University
Hangrui Xu
The Shenzhen International Graduate School, Tsinghua University and School of Computer Science and Information Engineering, Hefei University of Technology
Zirui Liao
The Shenzhen International Graduate School, Tsinghua University
Luyuan Zhang
The Shenzhen International Graduate School, Tsinghua University
Huaqiu Li
Tsinghua University
Computer Vision, Machine Learning
Peng Jiao
The Shenzhen International Graduate School, Tsinghua University
Haoqian Wang
The Shenzhen International Graduate School, Tsinghua University