🤖 AI Summary
General-purpose whole-body control for humanoid robots is difficult because locomotion tasks are diverse and task-specific policies generalize poorly. This paper proposes BumbleBee, an expert-generalist collaborative framework that combines three components: motion-semantic clustering (an autoencoder jointly embeds motion features and natural-language descriptions), iterative delta-action residual modeling, and multi-expert knowledge distillation. Policies are pretrained in simulation and then iteratively fine-tuned on real-world data. The design aims to mitigate conflicts between control objectives and to reduce the sim-to-real distribution shift. Evaluated on two high-fidelity simulation platforms and a physical humanoid robot, BumbleBee shows improved agility, robustness, and cross-behavior generalization, and the authors report state-of-the-art performance in general-purpose whole-body control.
📝 Abstract
Achieving general agile whole-body control on humanoid robots remains a major challenge due to diverse motion demands and data conflicts. While existing frameworks excel at training single motion-specific policies, they struggle to generalize across highly varied behaviors because of conflicting control requirements and mismatched data distributions. In this work, we propose BumbleBee (BB), an expert-generalist learning framework that combines motion clustering and sim-to-real adaptation to overcome these challenges. BB first uses an autoencoder-based clustering method to group behaviorally similar motions based on motion features and motion descriptions. An expert policy is then trained within each cluster and refined with real-world data through iterative delta action modeling to bridge the sim-to-real gap. Finally, the experts are distilled into a unified generalist controller that preserves agility and robustness across all motion types. Experiments in two simulation environments and on a real humanoid robot demonstrate that BB achieves state-of-the-art general whole-body control, setting a new benchmark for agile, robust, and generalizable humanoid performance in the real world.
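The abstract does not spell out how delta action modeling works, so the following is only a minimal sketch of one common reading of the idea: learn a correction to the commanded action so that the simulator reproduces transitions recorded on the real system. Everything here is an assumption made for illustration, not the authors' implementation: the one-dimensional dynamics in `sim_step` and `real_step`, the linear form of the correction, and all function names are hypothetical. A learned network would replace the closed-form least-squares fit, and BB applies the step iteratively, whereas the sketch shows a single round.

```python
import random


def sim_step(state, action):
    """Idealised toy simulator: the commanded action is applied exactly."""
    return state + action


def real_step(state, action):
    """Stand-in for the physical robot (hypothetical gain loss and bias)."""
    return state + 0.8 * action - 0.05


def policy(state):
    """Toy controller that drives the state toward 1.0."""
    return 0.5 * (1.0 - state)


def collect_real_transitions(n_steps, rng):
    """Roll out the policy on the 'real' system with small exploration noise."""
    transitions = []
    state = 0.0
    for _ in range(n_steps):
        action = policy(state) + rng.uniform(-0.1, 0.1)
        next_state = real_step(state, action)
        transitions.append((state, action, next_state))
        state = next_state
    return transitions


def fit_delta_action(transitions):
    """Fit delta(a) = w * a + b by ordinary least squares so that
    sim_step(s, a + delta(a)) matches the observed next state."""
    # For this toy simulator s' = s + a, so the action the simulator would
    # have needed is (s' - s); the regression target is the gap to a.
    xs = [a for (_, a, _) in transitions]
    ys = [(s_next - s) - a for (s, a, s_next) in transitions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


rng = random.Random(0)
data = collect_real_transitions(200, rng)
w, b = fit_delta_action(data)


def corrected_sim_step(state, action):
    """Simulator with the fitted delta action added to the command."""
    return sim_step(state, action + w * action + b)


print(f"fitted delta action: w={w:.3f}, b={b:.3f}")
print("gap before:", abs(sim_step(0.3, 0.4) - real_step(0.3, 0.4)))
print("gap after: ", abs(corrected_sim_step(0.3, 0.4) - real_step(0.3, 0.4)))
```

In this toy setting an expert policy would then be fine-tuned inside `corrected_sim_step` rather than `sim_step`, and the collect, fit, and fine-tune cycle would be repeated as the policy's state distribution shifts.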