🤖 AI Summary
To address the challenge of precisely localizing and interpreting movement errors in rehabilitation training, this paper proposes ExeChecker, a contrastive learning–based framework for interpreting exercise errors. Methodologically, it combines human pose estimation, graph attention networks, and Transformer interpretability techniques to jointly encode pairs of correct and incorrect motions. The paper also introduces ExeCheck, an in-house dataset of paired correct and incorrect rehabilitation exercise recordings that supports joint-level error localization. In experiments on both ExeCheck and UI-PRMD, the approach outperforms a pairwise sequence-alignment baseline at identifying the physically relevant erroneous joints. The result is more precise, actionable feedback for users and greater model transparency.
📝 Abstract
In this paper, we present ExeChecker, a contrastive-learning-based framework for the interpretation of rehabilitation exercises. Our work builds upon state-of-the-art advances in human pose estimation, graph-attention neural networks, and transformer interpretability. The downstream task is to assist rehabilitation by giving users informative feedback while they perform prescribed exercises. We use a contrastive learning strategy during training. Given a tuple of correctly and incorrectly executed exercises, our model identifies and highlights the joints that are involved in an incorrect movement and thus require the user's attention. We collected an in-house dataset, ExeCheck, with paired recordings of correct and incorrect executions of each exercise. In our experiments, we tested our method on this dataset as well as on the UI-PRMD dataset and found that ExeChecker outperformed a baseline based on pairwise sequence alignment in identifying joints of physical relevance in rehabilitation exercises.
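The abstract does not specify the form of the contrastive objective. As a rough illustration only, one common choice for tuples of correct and incorrect samples is a triplet margin loss, sketched below in plain Python. The function names, the margin value, and the toy 2-D embeddings are all assumptions made for this example; they are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (an assumed objective, not the paper's).

    anchor   : embedding of a correctly executed exercise
    positive : embedding of another correct execution
    negative : embedding of an incorrect execution
    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical toy embeddings, chosen so the distances are easy to check.
anchor = [0.0, 0.0]
positive = [3.0, 4.0]       # distance 5 from the anchor
far_negative = [6.0, 8.0]   # distance 10: already well separated
near_negative = [0.0, 3.0]  # distance 3: closer than the positive

print(triplet_margin_loss(anchor, positive, far_negative))
print(triplet_margin_loss(anchor, positive, near_negative))
```

In this sketch, an incorrect execution that sits closer to the anchor than a correct one produces a positive loss, which is the signal that would push the encoder to separate correct from incorrect movements.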