🤖 AI Summary
To address motion rigidity and biomechanical implausibility in text-to-sign-language video generation, this paper proposes a geometry-aware skeletal modeling approach. It explicitly encodes geometric relationships among joints, including shoulders, arms, and hands, through a bone-pose loss, bone-length constraints, and a parent-relative reweighting mechanism, which together constrain joint positions, bone lengths, and movement dynamics within an end-to-end training framework. The parent-relative reweighting targets finger flexibility and motion stiffness, while the bone-pose and bone-length terms keep the generated skeleton anatomically consistent. Experiments show that the method narrows the performance gap between the previous best method and the ground-truth oracle by 56.51%, and reduces discrepancies in bone length and movement variance by 18.76% and 5.48%, respectively. These results indicate gains in both anatomical realism and motion naturalness for sign language generation.
📝 Abstract
Sign language translation from text to video plays a crucial role in enabling effective communication for Deaf and hard-of-hearing individuals. A major challenge lies in generating accurate and natural body poses and movements that faithfully convey intended meanings. Prior methods often neglect the anatomical constraints and coordination patterns of human skeletal motion, resulting in rigid or biomechanically implausible outputs. To address this, we propose a novel approach that explicitly models the relationships among skeletal joints, including shoulders, arms, and hands, by incorporating geometric constraints on joint positions, bone lengths, and movement dynamics. During training, we introduce a parent-relative reweighting mechanism to enhance finger flexibility and reduce motion stiffness. Additionally, bone-pose losses and bone-length constraints enforce anatomically consistent structures. Our method narrows the performance gap between the previous best and the ground-truth oracle by 56.51%, and further reduces discrepancies in bone length and movement variance by 18.76% and 5.48%, respectively, demonstrating significant gains in anatomical realism and motion naturalness.
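
The abstract names three training terms but gives no formulas, so the following is only a minimal sketch of how such terms could look, not the authors' implementation. It assumes an L1 penalty on bone lengths, a cosine penalty on bone directions, and a per-bone weight vector for the parent-relative term; the five-joint chain `PARENTS` and all function names are hypothetical.

```python
import numpy as np

# Hypothetical 5-joint chain: shoulder -> elbow -> wrist -> finger base -> fingertip.
# PARENTS[j] is the index of joint j's parent; -1 marks the root.
PARENTS = np.array([-1, 0, 1, 2, 3])


def bone_vectors(joints, parents=PARENTS):
    """Return child-minus-parent vectors for every non-root joint.

    joints: array of shape (T, J, D) holding T frames of J joints in D dims.
    """
    child = np.arange(len(parents))[parents >= 0]
    return joints[:, child] - joints[:, parents[child]]


def bone_length_loss(pred, target, parents=PARENTS):
    """Assumed form: mean absolute difference between bone lengths."""
    pred_len = np.linalg.norm(bone_vectors(pred, parents), axis=-1)
    target_len = np.linalg.norm(bone_vectors(target, parents), axis=-1)
    return float(np.mean(np.abs(pred_len - target_len)))


def bone_direction_loss(pred, target, parents=PARENTS, eps=1e-8):
    """Assumed form: mean (1 - cosine similarity) between bone directions."""
    p = bone_vectors(pred, parents)
    t = bone_vectors(target, parents)
    p = p / (np.linalg.norm(p, axis=-1, keepdims=True) + eps)
    t = t / (np.linalg.norm(t, axis=-1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(p * t, axis=-1)))


def parent_relative_loss(pred, target, weights, parents=PARENTS):
    """Assumed form: weighted L2 error on parent-relative joint offsets.

    weights: shape (J - 1,), one weight per bone; larger values emphasize
    that bone (for example, finger bones).
    """
    diff = bone_vectors(pred, parents) - bone_vectors(target, parents)
    err = np.linalg.norm(diff, axis=-1)
    return float(np.mean(err * weights))
```

Because all three terms are computed on child-minus-parent vectors, translating the whole skeleton leaves them unchanged, so they penalize only structural errors. A position loss on absolute joint coordinates would still be needed alongside them to place the skeleton correctly.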