🤖 AI Summary
This study addresses long-horizon 3D full-body posture prediction during dynamic load-reaching tasks. We train and compare two time-series models, one based on a Transformer and one on a bidirectional LSTM (BLSTM), each conditioned on the hand-load position, lifting and handling technique, body weight and height, and the body posture from the first 25% of the task. To enforce anatomical plausibility, we add a cost-function term that keeps body segment lengths constant. This term reduces prediction error by approximately 8% for the arm model and 21% for the leg model. The Transformer-based model, with a root-mean-square error of 47.0 mm, is about 58% more accurate than the BLSTM-based model in long-term prediction. The results support the use of sequence models with segment-length constraints for predicting human kinematics during manual material handling.
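The constant-segment-length idea can be illustrated with a small NumPy sketch. The exact cost function is not given here, so the form below (mean squared coordinate error plus a weighted mean squared deviation of segment lengths from reference lengths) is an assumption, as are the function names, the `weight` parameter, and the toy three-joint chain.

```python
import numpy as np

def bone_lengths(pose, bones):
    """Segment lengths for a pose sequence.

    pose:  array of shape (T, J, 3), joint coordinates per frame (mm).
    bones: list of (parent, child) joint-index pairs (hypothetical skeleton).
    Returns an array of shape (T, B).
    """
    parents = [p for p, _ in bones]
    children = [c for _, c in bones]
    return np.linalg.norm(pose[:, children, :] - pose[:, parents, :], axis=-1)

def bone_length_penalty(pred, ref_lengths, bones):
    """Mean squared deviation of predicted segment lengths from reference lengths."""
    return float(np.mean((bone_lengths(pred, bones) - ref_lengths) ** 2))

def total_loss(pred, target, ref_lengths, bones, weight=1.0):
    """Assumed combined cost: coordinate MSE plus weighted segment-length penalty."""
    mse = float(np.mean((pred - target) ** 2))
    return mse + weight * bone_length_penalty(pred, ref_lengths, bones)

# Toy chain of three joints (for example shoulder, elbow, wrist), 4 frames.
bones = [(0, 1), (1, 2)]
frame = np.array([[0.0, 0.0, 0.0], [300.0, 0.0, 0.0], [300.0, 250.0, 0.0]])
target = np.tile(frame, (4, 1, 1))            # shape (4, 3, 3)
ref_lengths = bone_lengths(target, bones)[0]  # reference lengths from the first frame

pred = target.copy()
pred[:, 2, 1] += 10.0  # stretch the second segment by 10 mm in every frame

penalty = bone_length_penalty(pred, ref_lengths, bones)
loss = total_loss(pred, target, ref_lengths, bones, weight=1.0)
print(penalty, loss)
```

In a training setup the same expression would be written with a differentiable framework so the penalty contributes gradients; the NumPy version only shows what is being measured.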
📝 Abstract
This study aimed to explore the application of deep neural networks for whole-body human posture prediction during dynamic load-reaching activities. Two time-series models were trained using bidirectional long short-term memory (BLSTM) and transformer architectures. The dataset consisted of 3D full-body plug-in gait dynamic coordinates from 20 normal-weight healthy male individuals, each performing 204 load-reaching tasks from different load positions while adopting various lifting and handling techniques. The model inputs consisted of the 3D hand-load position, the lifting (stoop, full-squat and semi-squat) and handling (one- and two-handed) techniques, body weight and height, and the 3D coordinate data of the body posture from the first 25% of the task duration. The models used these inputs to predict body coordinates during the remaining 75% of the task period. Moreover, a novel method was proposed to improve the accuracy of the previous and present posture prediction networks by enforcing constant body segment lengths through the optimization of a new cost function. The results indicated that the new cost function decreased the prediction error of the models by approximately 8% and 21% for the arm and leg models, respectively. The transformer-based model, with a root-mean-square error of 47.0 mm, was ~58% more accurate in long-term prediction than the BLSTM-based model. This study supports the use of neural networks that capture time-series dependencies in 3D motion frames, providing a unique approach for understanding and predicting motion dynamics during manual material handling activities.
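The input layout described above can be sketched as follows. This is only an illustration under stated assumptions: the one-hot encoding of techniques, the units, the function names, and the RMSE definition (root mean square of per-joint Euclidean distance) are all hypothetical, and the toy sequence uses an arbitrary joint count rather than the actual marker set.

```python
import numpy as np

LIFTING = ("stoop", "full-squat", "semi-squat")
HANDLING = ("one-handed", "two-handed")

def condition_vector(load_xyz, lifting, handling, weight_kg, height_m):
    """Assumed encoding: load position (3) + lifting one-hot (3)
    + handling one-hot (2) + body weight + body height = 10 values."""
    lift = [1.0 if lifting == name else 0.0 for name in LIFTING]
    hand = [1.0 if handling == name else 0.0 for name in HANDLING]
    return np.array([*load_xyz, *lift, *hand, weight_kg, height_m], dtype=float)

def split_sequence(seq, observed_fraction=0.25):
    """Split a (T, J, 3) sequence into the observed start and the part to predict."""
    n_obs = max(1, int(round(seq.shape[0] * observed_fraction)))
    return seq[:n_obs], seq[n_obs:]

def rmse_mm(pred, target):
    """Assumed metric: root mean square of per-joint Euclidean distances (mm)."""
    dist = np.linalg.norm(pred - target, axis=-1)
    return float(np.sqrt(np.mean(dist ** 2)))

# Toy task: 100 frames, 5 joints (arbitrary numbers for illustration).
seq = np.zeros((100, 5, 3))
observed, future = split_sequence(seq)
cond = condition_vector((400.0, 0.0, 900.0), "stoop", "two-handed", 75.0, 1.78)

# A stand-in "prediction" that is offset by (3, 4, 0) mm at every joint and frame.
pred = future + np.array([3.0, 4.0, 0.0])
error = rmse_mm(pred, future)
print(observed.shape, future.shape, cond.shape, error)
```

A model would take `cond` and `observed` as inputs and produce an array shaped like `future`; the stand-in prediction here only exercises the metric.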