FLEX: A Large-Scale Multi-Modal Multi-Action Dataset for Fitness Action Quality Assessment

📅 2025-06-02
📈 Citations: 0 · Influential: 0
🤖 AI Summary
Existing action quality assessment (AQA) methods are limited to single-view competitive sports and RGB modalities, rendering them inadequate for risk identification and precise guidance in professional fitness training, especially for resistance-based exercises. To address this, we introduce the first large-scale, multimodal, multi-action dataset tailored for fitness AQA, comprising multi-view RGB videos, high-fidelity 3D pose sequences, surface electromyography (sEMG) signals, and physiological measurements. We pioneer the integration of sEMG into fitness AQA and propose a knowledge-graph-based annotation scheme, expressed as penalty functions, that enables fine-grained modeling of critical movement phases, error types, and corrective feedback. Experimental results demonstrate that multimodal inputs, multi-view inputs, and structured annotations significantly improve assessment accuracy. This work establishes a new benchmark and technical foundation for AI-driven, evidence-based fitness coaching.
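
The penalty-function annotation scheme described above can be pictured concretely. Below is a minimal Python sketch, assuming hypothetical field names, penalty weights, and error labels (the paper's actual schema is not reproduced here): each action decomposes into keysteps, each keystep lists error types with a penalty and a corrective cue, and a quality score is a base score minus accrued penalties.

```python
# Minimal sketch of a penalty-function style annotation. All field names,
# penalty values, and error labels are hypothetical illustrations, not the
# dataset's actual schema.
from dataclasses import dataclass, field

@dataclass
class ErrorType:
    name: str          # e.g. "knee valgus"
    penalty: float     # points deducted when the error is detected
    feedback: str      # corrective cue shown to the trainee

@dataclass
class Keystep:
    name: str                                    # e.g. "descent" phase of a squat
    errors: list[ErrorType] = field(default_factory=list)

@dataclass
class ActionAnnotation:
    action: str                                  # e.g. "barbell back squat"
    keysteps: list[Keystep] = field(default_factory=list)

def score(annotation: ActionAnnotation, detected: set[str], base: float = 10.0) -> float:
    """Subtract the penalty of every detected error; clamp the score at zero."""
    total_penalty = sum(
        e.penalty
        for ks in annotation.keysteps
        for e in ks.errors
        if e.name in detected
    )
    return max(0.0, base - total_penalty)
```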

📝 Abstract
With increasing awareness of health and the growing desire for an aesthetic physique, fitness has become a prevailing trend. However, the potential risks associated with fitness training, especially weight-loaded fitness actions, cannot be overlooked. Action Quality Assessment (AQA), a technology that quantifies the quality of human action and provides feedback, holds the potential to assist fitness enthusiasts of varying skill levels in achieving better training outcomes. Nevertheless, current AQA methodologies and datasets are limited to single-view competitive sports scenarios and the RGB modality, and lack professional assessment and guidance for fitness actions. To address this gap, we propose the FLEX dataset, the first multi-modal, multi-action, large-scale dataset that incorporates surface electromyography (sEMG) signals into AQA. FLEX uses high-precision MoCap to collect 20 different weight-loaded actions performed by 38 subjects across 3 skill levels for 10 repetitions each, yielding RGB video from 5 different views, 3D pose, sEMG, and physiological information. Additionally, FLEX incorporates knowledge graphs into AQA, constructing annotation rules in the form of penalty functions that map weight-loaded actions, action keysteps, error types, and feedback. We evaluated various baseline methods on FLEX, demonstrating that multimodal data, multi-view data, and fine-grained annotations significantly enhance model performance. FLEX not only advances AQA methodologies and datasets towards multi-modal and multi-action scenarios but also fosters the integration of artificial intelligence within the fitness domain. Dataset and code are available at https://haoyin116.github.io/FLEX_Dataset.
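
From the abstract's numbers, one repetition yields a synchronized multi-view, multi-modal sample. Here is a hypothetical sketch of such a sample's structure; the array shapes, field names, and the assumption that every subject performs every action are illustrative, not the released format:

```python
# Hypothetical shape of one FLEX sample, inferred from the abstract:
# 5 RGB views, a 3D pose sequence, sEMG channels, and physiological data.
# File layout, array shapes, and keys below are illustrative assumptions.
from dataclasses import dataclass
import numpy as np

@dataclass
class FlexSample:
    rgb_views: list[np.ndarray]   # 5 arrays, each (T, H, W, 3)
    pose_3d: np.ndarray           # (T, J, 3) MoCap joint positions
    semg: np.ndarray              # (T_emg, C) surface EMG channels
    physio: dict[str, float]      # physiological measurements
    action: str                   # one of the 20 weight-loaded actions
    skill_level: int              # one of the 3 skill levels
    repetition: int               # 1..10

# If every subject performs every action:
# 20 actions x 38 subjects x 10 repetitions
n_repetitions = 20 * 38 * 10
print(n_repetitions)  # 7600
```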
Problem

Research questions and friction points this paper is trying to address.

Addressing the lack of multi-modal datasets for fitness action quality assessment
Incorporating sEMG and multi-view data to improve AQA accuracy
Providing professional fitness feedback via knowledge graphs and penalty functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal dataset with sEMG and MoCap (a generic fusion sketch follows this list)
Incorporates knowledge graphs for AQA
Utilizes multiview RGB and 3D pose
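
To make the multi-modal point concrete, here is a generic late-fusion baseline sketch, not the paper's actual baseline: each modality gets its own temporal encoder, and the resulting features are concatenated before regressing a quality score. All dimensions, module choices, and the LateFusionAQA name are assumptions for illustration.

```python
# Generic late-fusion AQA baseline sketch (NOT the paper's method): encode
# pose and sEMG separately, concatenate, and regress a quality score.
import torch
import torch.nn as nn

class LateFusionAQA(nn.Module):
    def __init__(self, pose_dim=51, semg_dim=8, hidden=128):
        super().__init__()
        # per-modality temporal encoders (GRUs keep the sketch small)
        self.pose_enc = nn.GRU(pose_dim, hidden, batch_first=True)
        self.semg_enc = nn.GRU(semg_dim, hidden, batch_first=True)
        self.head = nn.Linear(2 * hidden, 1)  # regress a scalar quality score

    def forward(self, pose, semg):
        # pose: (B, T, pose_dim), semg: (B, T_emg, semg_dim); the two streams
        # may have different temporal lengths since sEMG is sampled faster.
        _, h_pose = self.pose_enc(pose)
        _, h_semg = self.semg_enc(semg)
        fused = torch.cat([h_pose[-1], h_semg[-1]], dim=-1)
        return self.head(fused).squeeze(-1)

model = LateFusionAQA()
scores = model(torch.randn(2, 100, 51), torch.randn(2, 400, 8))
print(scores.shape)  # torch.Size([2])
```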
Authors

Hao Yin
Meta Platforms Inc.
Wireless communication · Optimization · Machine Learning

Lijun Gu
School of Biomedical Engineering (Suzhou), USTC

Paritosh Parmar
University of British Columbia
Computer Vision · Video Understanding · Visual Question Answering · Action Quality Assessment

Lin Xu
School of Psychology, Beijing Sports University

Tianxiao Guo
School of Competitive Sports, Beijing Sports University

Weiwei Fu
Fudan University
data assimilation · inverse model · biogeochemical cycles

Yang Zhang
Suzhou Institute of Biomedical Engineering and Technology, CAS

Tianyou Zheng
Suzhou Institute of Biomedical Engineering and Technology, CAS