🤖 AI Summary
To address the lack of professional guidance and the difficulty of detecting subtle movement errors in home-based fitness training, this paper proposes FormCoach, a real-time AI fitness coach system based on vision-language models (VLMs). The system captures user movements via a simple camera and, guided by expert-defined scoring criteria, performs recognition, deviation quantification, and personalized corrective feedback for 22 strength and mobility exercises. Key contributions include: (1) the application of VLMs to fine-grained form correction; (2) the construction and open release of a benchmark dataset comprising 1,700 expert-annotated user-reference video pairs; and (3) an automated, rubric-based evaluation pipeline that enables standardized comparison across models. Benchmarks of state-of-the-art VLMs reveal substantial gaps relative to human-level coaching in context-aware movement analysis, pointing to concrete directions for improving embodied AI fitness systems.
📝 Abstract
Good form is the difference between strength and strain, yet for the fast-growing community of at-home fitness enthusiasts, expert feedback is often out of reach. FormCoach transforms a simple camera into an always-on, interactive AI training partner, capable of spotting subtle form errors and delivering tailored corrections in real time, leveraging vision-language models (VLMs). We showcase this capability through a web interface and benchmark state-of-the-art VLMs on a dataset of 1,700 expert-annotated user-reference video pairs spanning 22 strength and mobility exercises. To accelerate research in AI-driven coaching, we release both the dataset and an automated, rubric-based evaluation pipeline, enabling standardized comparison across models. Our benchmarks reveal substantial gaps compared to human-level coaching, underscoring both the challenges and opportunities in integrating nuanced, context-aware movement analysis into interactive AI systems. By framing form correction as a collaborative and creative process between humans and machines, FormCoach opens a new frontier in embodied AI.