🤖 AI Summary
This work addresses two challenges in teaching systems collaborative physical activities: modeling user intent, and system behavior that is opaque and hard to correct. It proposes an editable program synthesis framework grounded in narrated demonstrations, i.e., paired action sequences and natural language descriptions. The method formalizes collaborative task learning as a multimodal program synthesis problem, represents collaboration logic as editable code structures, and uses natural language for both instruction and feedback, which enables semantic-level behavioral alignment and interactive program correction. Key contributions include: (1) a formulation of collaborative physical behavior as human-readable, editable programs; and (2) a natural-language-driven loop of learning, explanation, and correction. In a user study with 20 participants, 70% (14/20) successfully refined the synthesized programs to match their intent, and 90% (18/20) found the programs easy to correct, supporting the framework’s explainability, controllability, and usability.
📝 Abstract
Teaching systems physical tasks is a long-standing goal in HCI, yet most prior work has focused on non-collaborative physical activities. Collaborative tasks introduce added complexity: the system must infer users' assumptions about their teammates' intent, which is an inherently ambiguous and dynamic process. This calls for representations that are interpretable and correctable, so that users can inspect and refine system behavior. We address this challenge by framing collaborative task learning as a program synthesis problem. Our system represents behavior as editable programs and uses narrated demonstrations, i.e., paired physical actions and natural language, as a unified modality for teaching, inspecting, and correcting system logic without requiring users to see or write code. The system uses the same modality to communicate what it has learned back to users. In a within-subjects study, 20 users taught multiplayer soccer tactics to our system. 70 percent (14/20) of participants successfully refined the learned programs to match their intent, and 90 percent (18/20) found it easy to correct the programs. The study surfaced unique challenges in representing learning as programs and in enabling users to teach collaborative physical activities. We discuss these issues and outline mitigation strategies.