🤖 AI Summary
To address insufficient behavioral similarity, appropriateness, and structural adaptability when transferring behaviors across humanoid robots with different body structures, this paper proposes a two-layer closed-loop generative framework. The upper layer works at the semantic level, performing context-aware modeling and fine-grained behavioral annotation for action planning. The lower layer works at the physical level, ensuring execution consistency through bone-scaling data augmentation and millimeter-level pose alignment. To bridge the structural gap between motion generation and execution on heterogeneous platforms, the authors introduce the context-rich HPose dataset and a novel bone-scaling augmentation strategy. Extensive evaluation across multiple commercial humanoid platforms shows significant improvements: +23.6% in motion similarity, +18.4% in behavioral appropriateness, and a 2.1× inference speedup. To the authors' knowledge, this is the first approach to enable high-fidelity, generalizable, and low-latency humanoid behavior execution across morphologies.
📝 Abstract
Achieving both behavioral similarity and appropriateness in human-like motion generation for humanoid robots remains an open challenge, further compounded by the lack of cross-embodiment adaptability. To address this problem, we propose HuBE, a bi-level closed-loop framework that integrates robot state, goal poses, and contextual situations to generate human-like behaviors. The framework ensures both behavioral similarity and appropriateness, and eliminates structural mismatches between motion generation and execution. To support this framework, we construct HPose, a context-enriched dataset featuring fine-grained situational annotations. Furthermore, we introduce a bone scaling-based data augmentation strategy that ensures millimeter-level compatibility across heterogeneous humanoid robots. Comprehensive evaluations on multiple commercial platforms demonstrate that HuBE significantly improves motion similarity, behavioral appropriateness, and computational efficiency over state-of-the-art baselines, establishing a solid foundation for transferable and human-like behavior execution across diverse humanoid robots.
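The abstract does not spell out how the bone scaling-based augmentation is computed. As a rough illustration of the general idea only, the sketch below rescales the length of each bone in a skeleton while keeping its direction, then rebuilds joint positions from the root outward. The function name `scale_bones`, the joint names, and the per-bone scale factors are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]


def scale_bones(
    joints: Dict[str, Vec3],
    parents: Dict[str, Optional[str]],
    scales: Dict[str, float],
) -> Dict[str, Vec3]:
    """Rescale each bone (parent-to-child offset) and rebuild joint positions.

    Hypothetical helper: bone directions are preserved, only lengths change.
    `scales` maps a child joint name to the factor applied to the bone that
    ends at that joint; joints missing from `scales` keep their bone length.
    """
    scaled: Dict[str, Vec3] = {}

    def place(name: str) -> Vec3:
        if name in scaled:
            return scaled[name]
        parent = parents[name]
        if parent is None:
            # The root joint keeps its original position.
            scaled[name] = joints[name]
        else:
            px, py, pz = place(parent)
            # Original bone vector from parent to child.
            ox, oy, oz = (c - p for c, p in zip(joints[name], joints[parent]))
            s = scales.get(name, 1.0)
            scaled[name] = (px + s * ox, py + s * oy, pz + s * oz)
        return scaled[name]

    for name in joints:
        place(name)
    return scaled


# Toy three-joint arm with made-up coordinates (metres).
pose = {
    "shoulder": (0.0, 0.0, 2.0),
    "elbow": (0.0, 0.0, 1.0),
    "wrist": (1.0, 0.0, 1.0),
}
parents = {"shoulder": None, "elbow": "shoulder", "wrist": "elbow"}

# Halve the upper arm and double the forearm.
robot_pose = scale_bones(pose, parents, {"elbow": 0.5, "wrist": 2.0})
```

Sampling many sets of scale factors in this way would yield pose variants covering different limb proportions, which is one plausible reading of how such augmentation could expose a model to heterogeneous robot skeletons.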