🤖 AI Summary
This work addresses the challenge of accurately inferring human task-level intent from brief physical corrections and achieving semantic adaptation in human-robot collaboration. The authors propose TATIC, a unified framework that, for the first time, integrates task-level semantic intent inference with physical interaction. By combining torque-based contact force estimation with a task-aware temporal convolutional network (TCN), the framework jointly infers discrete task intentions and continuous motion parameters. Task-aligned feature canonicalization is introduced to enhance cross-scenario generalization, and an intent-driven motion adaptation mechanism enables end-to-end mapping from physical feedback to task-level adaptation. Experimental results demonstrate a Macro-F1 score of 0.904 in intent recognition, and the approach is validated on physical hardware in a real-world collaborative disassembly task.
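The joint inference described above, a single temporal network producing both a discrete intent label and continuous motion parameters from a short force window, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the window length, channel counts, number of intent classes, and number of motion parameters are all assumed for demonstration, and the weights are random rather than trained.

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """Causal dilated 1-D convolution with ReLU, the basic TCN layer.
    x: (T, C_in) time series, w: (K, C_in, C_out) kernel.
    Left-pads so the output keeps length T and only looks at the past."""
    K = w.shape[0]
    pad = (K - 1) * dilation
    xp = np.concatenate([np.zeros((pad, x.shape[1])), x], axis=0)
    out = np.zeros((x.shape[0], w.shape[2]))
    for t in range(x.shape[0]):
        for k in range(K):
            out[t] += xp[t + pad - k * dilation] @ w[k]
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
T, C = 50, 6                  # assumed: 50-sample window of a 6-axis contact wrench
N_INTENTS, N_PARAMS = 4, 3    # hypothetical: 4 task intents, 3 motion parameters

x = rng.normal(size=(T, C))              # estimated contact-force/torque sequence
w1 = rng.normal(size=(3, C, 16)) * 0.1   # TCN layer 1, dilation 1
w2 = rng.normal(size=(3, 16, 16)) * 0.1  # TCN layer 2, dilation 2
h = causal_dilated_conv(causal_dilated_conv(x, w1, 1), w2, 2)

feat = h.mean(axis=0)                    # temporal pooling over the correction window
W_cls = rng.normal(size=(16, N_INTENTS)) * 0.1   # discrete-intent head
W_reg = rng.normal(size=(16, N_PARAMS)) * 0.1    # motion-parameter head
logits = feat @ W_cls
intent_probs = np.exp(logits - logits.max())
intent_probs /= intent_probs.sum()       # softmax over task-level intents
motion_params = feat @ W_reg             # continuous motion-level parameters
```

The key point is the shared temporal backbone with two output heads, so one brief physical correction yields both the "what" (task intent) and the "how much" (motion parameters) at once.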
📝 Abstract
In human-robot collaboration (HRC), robots must adapt online to dynamic task constraints and evolving human intent. While physical corrections provide a natural, low-latency channel for operators to convey motion-level adjustments, extracting task-level semantic intent from such brief interactions remains challenging. Existing foundation-model-based approaches primarily rely on vision and language inputs and lack mechanisms to interpret physical feedback. Meanwhile, traditional physical human-robot interaction (pHRI) methods leverage physical corrections for trajectory guidance but struggle to infer task-level semantics. To bridge this gap, we propose TATIC, a unified framework that utilizes torque-based contact force estimation and a task-aware Temporal Convolutional Network (TCN) to jointly infer discrete task-level intent and estimate continuous motion-level parameters from brief physical corrections. Task-aligned feature canonicalization ensures robust generalization across diverse layouts, while an intent-driven adaptation scheme translates inferred human intent into robot motion adaptations. Experiments achieve a 0.904 Macro-F1 score in intent recognition and demonstrate successful hardware validation in collaborative disassembly (see experimental video at https://youtu.be/xF8A52qwEc8).
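One way to read "task-aligned feature canonicalization ensures robust generalization across diverse layouts" is that measured contact forces are re-expressed in a task-attached frame before classification, so the same correction produces the same feature regardless of where the task sits in the workspace. The sketch below illustrates that idea under this assumption; the rotation construction and frame names are illustrative, not taken from the paper.

```python
import numpy as np

def rotz(theta):
    """Rotation about z by theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def canonicalize(force_world, R_task):
    """Express a world-frame contact force in the task frame
    (R_task maps task-frame vectors into the world frame)."""
    return R_task.T @ force_world

# The same physical correction, expressed in the task frame.
f_task = np.array([1.0, 0.5, -0.2])

# The same task placed at different workspace orientations ("layouts").
for theta in (0.0, np.pi / 3, -1.1):
    R = rotz(theta)
    f_world = R @ f_task                      # what the robot senses in each layout
    f_canon = canonicalize(f_world, R)        # layout-invariant feature
    assert np.allclose(f_canon, f_task)
```

Feeding the canonical (task-frame) wrench rather than the raw world-frame measurement into the intent classifier is what lets a model trained in one layout transfer to another.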