AI Summary
This work addresses the critical scarcity of large-scale, real-world human-robot interaction data that severely limits robots' ability to learn robust behaviors in practical settings. To overcome this challenge, the authors propose a generative simulation framework termed "text2sim2real," which establishes the first end-to-end pipeline for physically grounded human-robot interaction. Leveraging large language models and vision-language models, the framework programmatically generates soft-body human avatars, scene layouts, and robot trajectories directly from natural language instructions. Visual imitation policies trained solely on synthetic point cloud data generated by this pipeline achieve zero-shot sim-to-real transfer without any real interaction data. The approach demonstrates over 80% success rates on tasks such as scratching and bathing, and exhibits strong generalization to unseen human motions, validating its effectiveness and robustness.
Abstract
Developing autonomous physical human-robot interaction (pHRI) systems is limited by the scarcity of large-scale training data to learn robust robot behaviors for real-world applications. In this paper, we introduce a zero-shot "text2sim2real" generative simulation framework that automatically synthesizes diverse pHRI scenarios from high-level natural-language prompts. Leveraging Large Language Models (LLMs) and Vision-Language Models (VLMs), our pipeline procedurally generates soft-body human models, scene layouts, and robot motion trajectories for assistive tasks. We utilize this framework to autonomously collect large-scale synthetic demonstration datasets and then train vision-based imitation learning policies operating on segmented point clouds. We evaluate our approach through a user study on two physically assistive tasks: scratching and bathing. Our learned policies successfully achieve zero-shot sim-to-real transfer, attaining success rates exceeding 80% and demonstrating resilience to unscripted human motion. Overall, we introduce the first generative simulation pipeline for pHRI applications, automating simulation environment synthesis, data collection, and policy learning. Additional information may be found on our project website: https://rchi-lab.github.io/gen_phri/