🤖 AI Summary
The high cost, low efficiency, and dependence on physical robot hardware of teleoperation hinder scalable demonstration collection for imitation learning. Method: We propose AirExo-2, a low-cost exoskeleton system for large-scale in-the-wild demonstration collection, together with a demonstration adaptor that transforms the collected in-the-wild demonstrations into pseudo-robot demonstrations. We further present RISE-2, a generalizable imitation learning policy that integrates 2D and 3D perception. Contribution/Results: Trained solely on in-the-wild demonstrations collected and transformed by AirExo-2, with no additional robot demonstrations, RISE-2 achieves performance comparable or superior to policies trained on teleoperated data. It also outperforms previous imitation learning policies on both in-domain and out-of-domain tasks, even with limited demonstrations. These results point toward scalable imitation learning whose data collection does not require a physical robot.
📝 Abstract
Scaling up imitation learning for real-world applications requires efficient and cost-effective demonstration collection methods. Current teleoperation approaches, though effective, are expensive and inefficient due to their dependency on physical robot platforms. Alternative data sources like in-the-wild demonstrations can eliminate the need for physical robots and offer more scalable solutions. However, existing in-the-wild data collection devices have limitations: handheld devices offer restricted in-hand camera observation, while whole-body devices often require fine-tuning with robot data due to action inaccuracies. In this paper, we propose AirExo-2, a low-cost exoskeleton system for large-scale in-the-wild demonstration collection. By introducing a demonstration adaptor that transforms the collected in-the-wild demonstrations into pseudo-robot demonstrations, our system addresses key challenges in utilizing in-the-wild demonstrations for downstream imitation learning in real-world environments. Additionally, we present RISE-2, a generalizable policy that integrates 2D and 3D perception, outperforming previous imitation learning policies in both in-domain and out-of-domain tasks, even with limited demonstrations. By leveraging in-the-wild demonstrations collected and transformed by the AirExo-2 system, without the need for additional robot demonstrations, RISE-2 achieves comparable or superior performance to policies trained with teleoperated data, highlighting the potential of AirExo-2 for scalable and generalizable imitation learning. Project page: https://airexo.tech/airexo2