🤖 AI Summary
This work addresses a limitation of existing gait recognition methods: they rely predominantly on RGB-derived modalities, which hinders multimodal collaboration and cross-modal retrieval in real-world scenarios. To bridge this gap, we introduce MMGait, a large-scale multimodal gait benchmark spanning five sensor types, twelve modalities, and 334,060 sequences from 725 subjects. We further propose the Omni Multi-Modal Gait Recognition task, which unifies unimodal, cross-modal, and multimodal gait recognition within a single model. Based on this benchmark, we develop OmniGait, a simple baseline that learns a shared embedding space across data from RGB, depth, infrared, LiDAR, and 4D radar sensors. Experiments show that OmniGait achieves promising recognition performance. The dataset, code, and pretrained models are publicly released to foster future research.
📝 Abstract
Gait recognition has emerged as a powerful biometric technique for identifying individuals at a distance without requiring user cooperation. Most existing methods focus on RGB-derived modalities, which fall short in real-world scenarios requiring multi-modal collaboration and cross-modal retrieval. To overcome these challenges, we present MMGait, a comprehensive multi-modal gait benchmark integrating data from five heterogeneous sensors: an RGB camera, a depth camera, an infrared camera, a LiDAR scanner, and a 4D radar system. MMGait contains twelve modalities and 334,060 sequences from 725 subjects, enabling systematic exploration across geometric, photometric, and motion domains. Based on MMGait, we conduct extensive evaluations on single-modal, cross-modal, and multi-modal paradigms to analyze modality robustness and complementarity. Furthermore, we introduce a new task, Omni Multi-Modal Gait Recognition, which aims to unify these three gait recognition paradigms within a single model. We also propose a simple yet powerful baseline, OmniGait, which learns a shared embedding space across diverse modalities and achieves promising recognition performance. The MMGait benchmark, codebase, and pretrained checkpoints are publicly available at https://github.com/BNU-IVC/MMGait.
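To make the three paradigms concrete, the following is a minimal sketch of how single-modal, cross-modal, and multi-modal retrieval could be scored once every modality is mapped into one shared embedding space. It is not the OmniGait implementation: the embeddings are synthetic, the helper names (`rank1_accuracy`, `fuse`, `fake_embed`) are hypothetical, and averaging normalized embeddings is only one assumed way to fuse modalities.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def rank1_accuracy(probe_emb, probe_ids, gallery_emb, gallery_ids):
    """Fraction of probes whose most similar gallery embedding shares their identity."""
    sims = l2_normalize(probe_emb) @ l2_normalize(gallery_emb).T
    nearest = np.argmax(sims, axis=1)
    return float(np.mean(gallery_ids[nearest] == probe_ids))

def fuse(embeddings):
    """Assumed fusion rule (not from the paper): mean of normalized embeddings."""
    return np.mean([l2_normalize(e) for e in embeddings], axis=0)

# Synthetic stand-in for model outputs: one identity vector per subject,
# perturbed independently for each modality and each probe/gallery split.
rng = np.random.default_rng(0)
num_subjects, dim = 20, 32
ids = np.arange(num_subjects)
identity = rng.normal(size=(num_subjects, dim))

def fake_embed(noise=0.3):
    """Hypothetical encoder output: identity vector plus Gaussian noise."""
    return identity + noise * rng.normal(size=identity.shape)

gallery = {"rgb": fake_embed(), "lidar": fake_embed()}
probe = {"rgb": fake_embed(), "lidar": fake_embed()}

# Single-modal: probe and gallery come from the same modality.
single_modal = rank1_accuracy(probe["rgb"], ids, gallery["rgb"], ids)
# Cross-modal: probe and gallery come from different modalities.
cross_modal = rank1_accuracy(probe["lidar"], ids, gallery["rgb"], ids)
# Multi-modal: each side fuses all of its available modalities.
multi_modal = rank1_accuracy(
    fuse(list(probe.values())), ids, fuse(list(gallery.values())), ids
)
```

Because all modalities live in the same space, the same similarity search serves all three settings; only the choice of probe and gallery embeddings changes.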