🤖 AI Summary
This work addresses a limitation in human-robot collaboration: continuous human movements and discrete actions are still largely modelled in isolation rather than jointly. The paper introduces MA-HERP, a hierarchical and recursive probabilistic framework in which movements compose into actions through admissible Allen interval relations. A unified probabilistic factorisation couples continuous dynamics, discrete action labels, and their durations. Inference is recursive and inspired by Bayesian filtering, alternating top-down action prediction with bottom-up sensory evidence. A preliminary evaluation, based on neural models trained on musculoskeletal simulations of reaching movements, shows accurate motion prediction, robust action inference under noise, and computational performance compatible with on-line human-robot collaboration.
📝 Abstract
Fluent human--robot collaboration requires robots to continuously estimate human behaviour and anticipate future intentions. This entails reasoning jointly about \emph{continuous movements} and \emph{discrete actions}, which are still largely modelled in isolation. In this paper, we introduce \textsf{MA-HERP}, a hierarchical and recursive probabilistic framework for the \emph{joint estimation and prediction} of human movements and actions. The model combines: (i) a hierarchical representation in which movements compose into actions through admissible Allen interval relations, (ii) a unified probabilistic factorisation coupling continuous dynamics, discrete labels, and durations, and (iii) a recursive inference scheme inspired by Bayesian filtering, alternating top-down action prediction with bottom-up sensory evidence. We present a preliminary experimental evaluation based on neural models trained on musculoskeletal simulations of reaching movements, showing accurate motion prediction, robust action inference under noise, and computational performance compatible with on-line human--robot collaboration.
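The alternation between top-down prediction and bottom-up evidence can be illustrated with a generic discrete Bayes filter over action labels. This is a minimal sketch and not the MA-HERP model: the two action labels, the transition matrix, the likelihood values, and the function names `predict`, `update`, and `run_filter` are all assumptions made for illustration, and the sketch omits the continuous dynamics, durations, and Allen interval relations described above.

```python
import numpy as np


def predict(belief, transition):
    """Top-down step: propagate the action belief through the transition model.

    transition[i, j] is assumed to be P(next action = j | current action = i).
    """
    return transition.T @ belief


def update(prior, likelihood):
    """Bottom-up step: reweight the predicted belief by the observation likelihood."""
    posterior = prior * likelihood
    total = posterior.sum()
    if total == 0.0:
        # The evidence rules out every action; fall back to the prediction.
        return prior
    return posterior / total


def run_filter(belief, transition, likelihoods):
    """Alternate predict and update once per incoming observation."""
    history = []
    for likelihood in likelihoods:
        belief = update(predict(belief, transition), likelihood)
        history.append(belief)
    return history


# Hypothetical two-action example: index 0 = "reach", index 1 = "retract".
transition = np.array([[0.9, 0.1],
                       [0.2, 0.8]])
initial_belief = np.array([0.5, 0.5])

# Assumed per-step likelihoods P(observation | action), e.g. from a sensor model.
likelihoods = [
    np.array([0.8, 0.2]),
    np.array([0.7, 0.3]),
    np.array([0.1, 0.9]),
]

history = run_filter(initial_belief, transition, likelihoods)
for step, belief in enumerate(history, start=1):
    print(f"step {step}: P(reach)={belief[0]:.3f}, P(retract)={belief[1]:.3f}")
```

Each iteration costs one matrix-vector product and one normalisation, which is the property that makes recursive filtering attractive for on-line use. A hierarchical model would run a step of this kind at each level and pass beliefs between levels.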