🤖 AI Summary
To address the high computational cost and poor real-time deployability of human motion prediction models in robotic handover tasks, this paper proposes IntentMotion, an intent-aware conditional modeling framework. Built on the siMLPe backbone, it adds intention-aware conditioning, a task-specific multi-objective loss function, and a lightweight intent classifier. Compared with existing state-of-the-art methods, IntentMotion reduces body loss error by over 50%, runs inference 200× faster, and uses only 3% of the parameters. These gains balance accuracy and efficiency, making the framework suitable for low-latency, embeddable motion prediction in real-world human–robot collaboration.
📝 Abstract
Accurate human motion prediction (HMP) is critical for seamless human-robot collaboration, particularly in handover tasks that require real-time adaptability. Despite the high accuracy of state-of-the-art models, their computational complexity limits practical deployment in real-world robotic applications. In this work, we enhance human motion forecasting for handover tasks by leveraging siMLPe [1], a lightweight yet powerful architecture, and introducing key improvements. Our approach, named IntentMotion, incorporates intention-aware conditioning, task-specific loss functions, and a novel intention classifier, significantly improving motion prediction accuracy while maintaining efficiency. Experimental results demonstrate that our method reduces body loss error by over 50%, achieves 200x faster inference, and requires only 3% of the parameters compared to existing state-of-the-art HMP models. These advancements establish our framework as a highly efficient and scalable solution for real-time human-robot interaction.
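The abstract does not spell out how intention-aware conditioning or the task-specific losses are implemented, so the following is only an illustrative sketch of what such components might look like, not the authors' method. It assumes a one-hot intent label appended to every frame of the pose history, and a per-joint weighted Euclidean error as a stand-in for a task-specific loss. The function names, tensor shapes, number of intents, and joint weights are all hypothetical.

```python
import numpy as np


def condition_on_intent(past_poses, intent_id, num_intents):
    """Append a one-hot intent vector to every frame of the pose history.

    past_poses: array of shape (frames, features). Hypothetical layout.
    Returns an array of shape (frames, features + num_intents).
    """
    one_hot = np.zeros(num_intents, dtype=past_poses.dtype)
    one_hot[intent_id] = 1.0
    # Repeat the same intent vector for each frame, then concatenate.
    tiled = np.tile(one_hot, (past_poses.shape[0], 1))
    return np.concatenate([past_poses, tiled], axis=1)


def weighted_joint_loss(pred, target, joint_weights):
    """Weighted mean per-joint Euclidean error.

    pred, target: arrays of shape (frames, joints, 3).
    joint_weights: array of shape (joints,). Assumed weighting scheme,
    e.g. to emphasize the hand joints in a handover.
    """
    per_joint = np.linalg.norm(pred - target, axis=-1)  # (frames, joints)
    total = (per_joint * joint_weights).sum()
    return float(total / (joint_weights.sum() * pred.shape[0]))
```

With uniform weights the loss reduces to a plain mean per-joint position error; raising the weight of selected joints makes errors on those joints count for more, which is one plausible way a handover-specific objective could be expressed.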