🤖 AI Summary
This work proposes IMPACT-HOI, a mixed-initiative annotation framework for constructing action-centric human-object interaction (HOI) event graphs from egocentric procedural video, intended as structured supervision for robot learning from human demonstrations. The approach formulates HOI event construction as the incremental resolution of a partially specified, onset-anchored event state. A trust-calibrated controller selects the interaction strategy, and a risk-bounded execution protocol with atomic rollback preserves the integrity of confirmed annotations. In a user study with 9 participants, the method reduces manual annotation actions by 13.5%, achieves an event match rate of 46.67%, and incurs zero violations of previously confirmed fields under the studied protocol.
📝 Abstract
We present IMPACT-HOI, a mixed-initiative framework for annotating egocentric procedural video by constructing structured event graphs for Human-Object Interactions (HOI), motivated by the need for high-quality structured supervision for learning robot manipulation from human demonstration. IMPACT-HOI frames this task as the incremental resolution of a partially specified, onset-anchored event state. A trust-calibrated controller selects among direct queries, human-confirmed suggestions, and conservative completions based on empirical annotator behavior and evidence quality. A risk-bounded execution protocol, utilizing atomic rollback, ensures that human-confirmed decisions are preserved against conflicting automated updates. A user study with 9 participants shows a 13.5% reduction in manual annotation actions, a 46.67% event match rate, and zero confirmed-field violations under the studied protocol. The code will be made publicly available at https://github.com/541741106/IMPACT_HOI.
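The abstract does not specify implementation details, so the following is only a minimal sketch of how a risk-bounded update with atomic rollback and a three-way mode selection might look. Every name here (`EventState`, `apply_automated`, `choose_mode`, the field names, and both thresholds) is a hypothetical illustration and is not taken from the IMPACT-HOI code.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


class ConfirmedFieldViolation(Exception):
    """Raised when an automated update conflicts with a human-confirmed field."""


@dataclass
class EventState:
    # Hypothetical partially specified event, anchored at an onset frame.
    onset_frame: int
    fields: Dict[str, str] = field(default_factory=dict)
    confirmed: Set[str] = field(default_factory=set)

    def confirm(self, name: str, value: str) -> None:
        """Record a human-confirmed value for one field."""
        self.fields[name] = value
        self.confirmed.add(name)

    def apply_automated(self, updates: Dict[str, str]) -> bool:
        """Apply all updates or none of them.

        If any update would overwrite a confirmed field with a different
        value, the whole batch is rolled back and False is returned.
        """
        snapshot = dict(self.fields)  # cheap copy used for rollback
        try:
            for name, value in updates.items():
                if name in self.confirmed and self.fields.get(name) != value:
                    raise ConfirmedFieldViolation(name)
                self.fields[name] = value
        except ConfirmedFieldViolation:
            self.fields = snapshot  # atomic rollback: discard partial writes
            return False
        return True


def choose_mode(acceptance_rate: float, evidence: float,
                accept_threshold: float = 0.8,
                evidence_threshold: float = 0.6) -> str:
    """Pick an interaction mode from annotator behavior and evidence quality.

    The thresholds are assumed placeholder values, not values from the paper.
    """
    if evidence < evidence_threshold:
        return "query"      # weak evidence: ask the annotator directly
    if acceptance_rate < accept_threshold:
        return "suggest"    # propose a value and wait for confirmation
    return "complete"       # conservative automatic completion


state = EventState(onset_frame=120)
state.confirm("verb", "pick_up")
rejected = state.apply_automated({"object": "cup", "verb": "push"})
accepted = state.apply_automated({"object": "cup"})
```

In this sketch the first batch is rejected as a whole because it conflicts with the confirmed `verb`, so the non-conflicting `object` write in the same batch is also discarded; the second batch touches no confirmed field and is applied.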