AI Summary
This study addresses the challenge of anatomically consistent, high-precision finger biomechanical tracking from monocular video, a key limitation for monitoring daily activities and quantifying joint range of motion. The authors propose a framework that integrates the SAM 3D Body foundation model with biomechanically constrained inverse kinematics optimization to recover finger joint angles from single-view video within a full-body biomechanical model. They establish a new mapping from Momentum Human Rig (MHR) outputs to biomechanical model markers, and port SAM 3D Body from PyTorch to JAX so that it integrates with MuJoCo-MJX for GPU-accelerated optimization. Validated against 8-camera multi-view reconstruction on 4,590 frames from seven participants, the method achieves approximately 10° joint angle error and approximately 6 mm hand position error after Procrustes alignment. Results are consistent across camera viewpoints and robust to the method used to produce the multi-view reference values.
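For readers unfamiliar with the metric, a Procrustes-aligned position error is commonly defined as follows. This is a generic formulation in notation introduced here ($\hat{x}_i$ for the monocular estimates, $x_i$ for the multi-view reference points, $N$ for the number of points). The summary does not state which points are compared, whether a scale factor is also fitted, or how the error is averaged, so treat it as an illustrative sketch rather than the authors' exact definition.

```latex
% Rigid alignment: best rotation R and translation t mapping estimates onto the reference.
(R^{*}, t^{*}) = \operatorname*{arg\,min}_{R \in SO(3),\; t \in \mathbb{R}^{3}}
    \sum_{i=1}^{N} \left\lVert R\,\hat{x}_i + t - x_i \right\rVert^{2}

% Mean residual distance after alignment (assumed averaging convention).
e = \frac{1}{N} \sum_{i=1}^{N} \left\lVert R^{*}\hat{x}_i + t^{*} - x_i \right\rVert
```

Because the alignment removes global rotation and translation, such an error reflects the shape of the reconstructed hand rather than its placement in the camera frame.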
Abstract
Accurate hand and finger tracking from video has significant clinical applications for monitoring activities of daily living and measuring range of motion, yet monocular video approaches for obtaining hand biomechanics remain under-developed. We present a method that combines the SAM 3D Body foundation model with inverse kinematics optimization in a full-body biomechanical model to extract anatomically-constrained finger joint angles from single-view video. We port SAM 3D Body from PyTorch to JAX for integration with MuJoCo-MJX, enabling GPU-accelerated optimization, and develop a novel mapping between the Momentum Human Rig (MHR) outputs and biomechanical model markers. Validation against 8-camera multiview reconstruction on 4,590 frames from 7 participants performing a variety of hand poses and object manipulation tasks shows finger joint angle errors of approximately 10 degrees and hand position errors of approximately 6 mm, after Procrustes alignment. Results were consistent across camera viewpoints and robust to different methods for producing reference values from multiview video. This work extends monocular biomechanical analysis to detailed finger tracking, expanding access to quantitative characterization of hand movement from readily available video.
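The idea of fitting joint angles to markers under anatomical limits can be illustrated with a toy example. The sketch below is not the authors' pipeline, which uses a full-body model in MuJoCo-MJX; it is a minimal planar finger with three joints, fitted by projected gradient descent. The segment lengths, joint limits, and all function names (`forward_kinematics`, `marker_cost`, `solve_ik`) are hypothetical values and names chosen for illustration.

```python
import math

# Hypothetical planar finger: segment lengths in arbitrary units, proximal to distal.
SEGMENT_LENGTHS = (1.0, 0.6, 0.5)
# Hypothetical flexion limits in radians for the MCP, PIP, and DIP joints.
JOINT_LIMITS = ((-0.5, 1.6), (0.0, 1.9), (0.0, 1.4))


def forward_kinematics(angles):
    """Return the 2D marker position at the distal end of each segment."""
    x = y = heading = 0.0
    markers = []
    for length, angle in zip(SEGMENT_LENGTHS, angles):
        heading += angle  # each joint angle is relative to the previous segment
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        markers.append((x, y))
    return markers


def marker_cost(angles, observed):
    """Sum of squared distances between model markers and observed markers."""
    model = forward_kinematics(angles)
    return sum((mx - ox) ** 2 + (my - oy) ** 2
               for (mx, my), (ox, oy) in zip(model, observed))


def solve_ik(observed, iterations=5000, step=0.05, eps=1e-6):
    """Projected gradient descent: take a gradient step, then clip to the limits."""
    angles = [0.5 * (lo + hi) for lo, hi in JOINT_LIMITS]  # start mid-range
    for _ in range(iterations):
        grad = []
        for i in range(len(angles)):
            plus, minus = list(angles), list(angles)
            plus[i] += eps
            minus[i] -= eps
            # Central finite difference; an autodiff framework would replace this.
            grad.append((marker_cost(plus, observed)
                         - marker_cost(minus, observed)) / (2 * eps))
        angles = [min(max(a - step * g, lo), hi)
                  for a, g, (lo, hi) in zip(angles, grad, JOINT_LIMITS)]
    return angles


# Synthetic check: generate markers from known angles, then recover the angles.
true_angles = (0.4, 0.8, 0.3)
recovered = solve_ik(forward_kinematics(true_angles))
```

Clipping after every step keeps the estimate inside the joint limits even when the observed markers would be best explained by an implausible pose, which is the sense in which the fitted angles are anatomically constrained. In a GPU setting the finite differences would typically be replaced by automatic differentiation and the loop batched across frames.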