AI Summary
This work addresses the challenge of motion retargeting across heterogeneous robots, a task that lacks a unified and generalizable solution due to significant differences in kinematics and dynamics across platforms. To overcome this limitation, the authors propose a morphology-agnostic conditional generative framework that leverages a latent intention space, augmented with a morphology-aware prompting mechanism and an adaptive layer normalization (AdaLN)-based dynamic modulation decoder. This design enables a single model to accommodate diverse robotic morphologies without requiring separate training for each platform. The approach supports zero-shot transfer to unseen, complex motions while preserving the dynamic characteristics of the source motion. Experimental validation on twelve distinct humanoid robots demonstrates the method's effectiveness in achieving high-fidelity cross-morphology motion transfer and unified control.
Abstract
Retargeting human motion to heterogeneous robots is a fundamental challenge in robotics, primarily due to the severe kinematic and dynamic discrepancies between varying embodiments. Existing solutions typically resort to training embodiment-specific models, which scales poorly and fails to exploit shared motion semantics. To address this, we present AdaMorph, a unified neural retargeting framework that enables a single model to adapt human motion to diverse robot morphologies. Our approach treats retargeting as a conditional generation task. We map human motion into a morphology-agnostic latent intent space and utilize a dual-purpose prompting mechanism to condition the generation. Instead of simple input concatenation, we leverage Adaptive Layer Normalization (AdaLN) to dynamically modulate the decoder's feature space based on embodiment constraints. Furthermore, we enforce physical plausibility through a curriculum-based training objective that ensures orientation and trajectory consistency via integration. Experimental results on 12 distinct humanoid robots demonstrate that AdaMorph effectively unifies control across heterogeneous topologies, exhibiting strong zero-shot generalization to unseen complex motions while preserving the dynamic essence of the source behaviors.
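The AdaLN-based modulation described in the abstract, where an embodiment condition scales and shifts normalized decoder features rather than being concatenated to the input, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, projection matrices, and dimensions (`d_cond`, `d_model`) are all assumptions.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each sample's feature vector to zero mean, unit variance
    # (no learned affine; the affine part comes from the condition below).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def adaln_modulate(features, embodiment_embedding, w_gamma, w_beta):
    # Adaptive Layer Normalization: project the embodiment embedding into
    # per-channel scale (gamma) and shift (beta), then modulate the
    # normalized features. With a zero condition this reduces to plain
    # layer normalization (the "1 +" keeps the identity as the default).
    gamma = embodiment_embedding @ w_gamma  # (batch, d_model)
    beta = embodiment_embedding @ w_beta    # (batch, d_model)
    return (1.0 + gamma) * layer_norm(features) + beta

# Toy shapes and random weights, purely for illustration.
rng = np.random.default_rng(0)
batch, d_cond, d_model = 2, 8, 16
features = rng.normal(size=(batch, d_model))       # decoder hidden states
cond = rng.normal(size=(batch, d_cond))            # per-robot embodiment embedding
w_gamma = 0.01 * rng.normal(size=(d_cond, d_model))
w_beta = 0.01 * rng.normal(size=(d_cond, d_model))

out = adaln_modulate(features, cond, w_gamma, w_beta)
```

In practice the projections would be learned layers inside each decoder block, so the same decoder weights can be reused across robots while the embodiment embedding steers the feature statistics.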