🤖 AI Summary
This work proposes a framework that reconstructs 3D infant poses from monocular video and retargets the resulting motions to physical and simulated humanoid platforms, such as iCub and pyCub, while generating synchronized multimodal sensory streams, including proprioception, touch, and vision. In contrast to existing motion retargeting methods that reproduce only kinematics, the framework simulates the first-person, multimodal sensorimotor experience characteristic of early infant development. For the best-matching embodiment, retargeting reaches sub-centimeter accuracy, which supports multimodal analysis of infant development and improved automated annotation of behaviors. By linking real-world observations with embodied simulation, the method offers new tools for robotics, developmental science, and the early detection of neurodevelopmental disorders.
📝 Abstract
Motion retargeting from humans to human-like artificial agents is becoming increasingly important as humanoid robots grow more capable. However, most existing approaches focus only on reproducing kinematics and ignore the rich sensorimotor experience associated with human movement. In this work, we present a framework for simulating the multimodal sensorimotor experiences of infants using physical and virtual humanoids. From a single video, our method reconstructs the infant's body configuration by extracting its skeletal structure and estimating the full 3D pose from each frame. We then map the reconstructed motion onto several developmental platforms: the physical iCub robot and the virtual simulators pyCub, EMFANT, and MIMo. Replaying the retargeted motions on these embodiments produces simulated multisensory streams including proprioception (joints and muscles), touch, and vision. For the best-matching embodiment, the retargeting achieves sub-centimeter accuracy and enables a rich multimodal analysis of infant development as well as enhanced automated annotation of behaviors. This framework provides a unique window into the infant's sensorimotor experience, offering new tools for robotics, developmental science, and early detection of neurodevelopmental disorders. The code is available at https://github.com/ctu-vras/motion-retargeting/.
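To make the "sub-centimeter accuracy" claim concrete, the following is a minimal sketch of one common way to score retargeting: the mean Euclidean distance between matching keypoints of the reconstructed infant pose and the retargeted embodiment, averaged over all frames. The abstract does not specify the metric actually used, so the function name `mean_keypoint_error`, the keypoint layout, and the sample coordinates are all hypothetical and are not taken from the linked repository.

```python
import math
from typing import Sequence

Point = Sequence[float]  # (x, y, z) in meters; the unit is an assumption


def mean_keypoint_error(source: Sequence[Sequence[Point]],
                        retargeted: Sequence[Sequence[Point]]) -> float:
    """Mean Euclidean distance (in meters) over all frames and keypoints.

    Both inputs are indexed as [frame][keypoint] and must have the same shape.
    """
    if len(source) != len(retargeted):
        raise ValueError("frame counts differ")
    total, count = 0.0, 0
    for src_frame, dst_frame in zip(source, retargeted):
        if len(src_frame) != len(dst_frame):
            raise ValueError("keypoint counts differ within a frame")
        for p, q in zip(src_frame, dst_frame):
            total += math.dist(p, q)  # straight-line distance between the two points
            count += 1
    if count == 0:
        raise ValueError("no keypoints to compare")
    return total / count


# Hypothetical data: 2 frames with 2 keypoints each (not real measurements).
infant = [[(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)],
          [(0.0, 0.0, 0.0), (0.1, 0.05, 0.0)]]
robot = [[(0.0, 0.0, 0.003), (0.1, 0.004, 0.0)],
         [(0.0, 0.0, 0.0), (0.1, 0.05, 0.005)]]

err = mean_keypoint_error(infant, robot)
print(f"{err * 100:.2f} cm")  # prints "0.30 cm"
```

Under this assumed metric, a result below 0.01 m would count as sub-centimeter. A real evaluation would also need the two skeletons expressed in a common coordinate frame and scale before comparison.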