🤖 AI Summary
Existing gait models are small and narrowly designed, which limits their scalability and leaves gait tasks fragmented across separate models that generalize poorly. This paper introduces FoundationGait, the first scalable gait foundation model: a contour-based, self-supervised contrastive learning framework that jointly pretrains a large-scale Transformer on 12 public datasets comprising over 2 million walking sequences. It empirically establishes, for the first time, scaling laws for gait representation learning, and it supports zero-shot cross-dataset, cross-task, and cross-modal transfer. FoundationGait unifies diverse downstream applications, including person identification, scoliosis screening, and depression prediction, and achieves zero-shot Rank-1 accuracies of 48.0% on Gait3D and 64.5% on OU-MVLP, indicating robustness in complex, real-world scenarios.
📝 Abstract
Gait patterns play a critical role in human identification and healthcare analytics, yet current progress remains constrained by small, narrowly designed models that fail to scale or generalize. Building a unified gait foundation model requires addressing two longstanding barriers: (a) Scalability. Why have gait models historically failed to follow scaling laws? (b) Generalization. Can one model serve the diverse gait tasks that have traditionally been studied in isolation? We introduce FoundationGait, the first scalable, self-supervised pretraining framework for gait understanding. Its largest version has nearly 0.13 billion parameters and is pretrained on 12 public gait datasets comprising over 2 million walking sequences. Extensive experiments demonstrate that FoundationGait, with or without fine-tuning, performs robustly across a wide spectrum of gait datasets, conditions, tasks (e.g., human identification, scoliosis screening, depression prediction, and attribute estimation), and even input modalities. Notably, it achieves 48.0% zero-shot rank-1 accuracy on the challenging in-the-wild Gait3D dataset (1,000 test subjects) and 64.5% on the largest in-the-lab OU-MVLP dataset (5,000+ test subjects), setting a new milestone in robust gait recognition. Code and models will be released at https://github.com/ShiqiYu/OpenGait.
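For readers unfamiliar with the rank-1 metric quoted above, the following is a minimal sketch of how rank-1 accuracy is commonly computed in retrieval-style recognition: each probe embedding is matched to its most similar gallery embedding, and the match counts as correct when the identities agree. The abstract does not describe the paper's exact evaluation protocol, so the cosine similarity measure, the function names, and the toy vectors here are all assumptions for illustration. In a zero-shot setting the embeddings would come from a pretrained model applied without fine-tuning on the test dataset.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rank1_accuracy(gallery, gallery_ids, probes, probe_ids):
    """Fraction of probes whose nearest gallery embedding has the same identity.

    Hypothetical helper: similarity is cosine, which is an assumption and
    not necessarily the measure used in the paper.
    """
    gallery_unit = [l2_normalize(g) for g in gallery]
    hits = 0
    for vec, true_id in zip(probes, probe_ids):
        p = l2_normalize(vec)
        sims = [sum(a * b for a, b in zip(p, g)) for g in gallery_unit]
        best = max(range(len(sims)), key=sims.__getitem__)  # top-1 match
        hits += int(gallery_ids[best] == true_id)
    return hits / len(probes)


# Toy 2-D embeddings (made up for illustration, not real gait features).
gallery = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
gallery_ids = [0, 1, 2]
probes = [[0.9, 0.1], [0.1, 0.9], [-0.8, 0.2], [0.7, 0.3]]
probe_ids = [0, 1, 2, 1]  # the last probe is closest to identity 0, so it misses

accuracy = rank1_accuracy(gallery, gallery_ids, probes, probe_ids)
print(f"rank-1 accuracy: {accuracy:.2f}")
```

On this toy data three of the four probes retrieve the correct identity. Real benchmarks such as Gait3D and OU-MVLP define their own gallery and probe splits, which this sketch does not reproduce.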