🤖 AI Summary
This review addresses three core challenges in physical human–humanoid interaction (pHHI) within unstructured, human-centric environments: whole-body control under uncertain human dynamics, real-time intent inference with limited sensing, and modeling of time-varying human physical states. It surveys the state of the art through three pillars (humanoid modeling and control, human intent estimation, and computational human modeling), organized via a taxonomy grounded in interaction modality (direct versus object-mediated) and level of robot engagement (assistance, cooperation, and collaboration). For each pillar, it examines representative approaches, identifies open challenges, and analyzes limitations that currently hinder robust, scalable, and adaptive interaction. By systematically identifying integration bottlenecks across these pillars, it proposes pathways toward cohesive interaction frameworks and a roadmap for safe, robust, and intuitive pHHI, aimed at enabling humanoid robots to understand, anticipate, and collaborate with human partners in real-world settings.
📝 Abstract
Physical Human–Humanoid Interaction (pHHI) is a rapidly advancing field with significant implications for deploying robots in unstructured, human-centric environments. In this review, we examine the current state of the art in pHHI through three core pillars: (i) humanoid modeling and control, (ii) human intent estimation, and (iii) computational human models. For each pillar, we survey representative approaches, identify open challenges, and analyze current limitations that hinder robust, scalable, and adaptive interaction. These include the need for whole-body control strategies capable of handling uncertain human dynamics, real-time intent inference under limited sensing, and modeling techniques that account for variability in human physical states. Although significant progress has been made within each domain, integration across pillars remains limited. We propose pathways for unifying methods across these areas to enable cohesive interaction frameworks. This structure enables us not only to map the current landscape but also to propose concrete directions for future research aimed at bridging these domains. Additionally, we introduce a unified taxonomy of interaction types along two dimensions: modality, distinguishing direct interactions (e.g., physical contact) from indirect, object-mediated interactions; and level of robot engagement, ranging from assistance to cooperation and collaboration. For each category in this taxonomy, we analyze the three core pillars to highlight opportunities for cross-pillar unification. Our goal is to suggest avenues toward robust, safe, and intuitive physical interaction, providing a roadmap for future research that will allow humanoid systems to effectively understand, anticipate, and collaborate with human partners in diverse real-world settings.