Interactive Program Synthesis for Modeling Collaborative Physical Activities from Narrated Demonstrations

📅 2025-09-28
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This work addresses the difficulty of modeling user intent in collaborative physical activities, where system behavior is typically opaque and uncorrectable. We propose an editable program synthesis framework grounded in narrated demonstrations: paired action sequences and natural language descriptions. Our method formalizes collaborative task learning as a multimodal program synthesis problem, representing collaboration logic as editable code structures and using natural language for both instruction and feedback, thereby enabling semantic-level behavioral alignment and interactive program correction. Key contributions include: (1) the first formulation of collaborative physical behavior as human-readable, editable programs; and (2) a natural-language-driven, end-to-end learning, explanation, and correction loop. A 20-participant user study shows that 70% of participants successfully edited synthesized programs to match their intent, and 90% rated the correction process as intuitive and usable, validating the framework's explainability, controllability, and usability.
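To make the central idea concrete, the sketch below shows what a synthesized, human-readable tactic program might look like. This is a minimal illustration under assumed names: the Player class, its methods (pass_to, move_toward, and so on), and the give_and_go tactic are hypothetical stand-ins, not the paper's actual program representation or DSL.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch only: Player and its methods are illustrative
# assumptions, not the paper's actual program representation.

@dataclass
class Player:
    name: str
    has_ball: bool = False
    is_open: bool = True
    log: List[str] = field(default_factory=list)

    def pass_to(self, other: "Player") -> None:
        self.has_ball, other.has_ball = False, True
        self.log.append(f"{self.name} passes to {other.name}")

    def move_toward(self, target: str) -> None:
        self.log.append(f"{self.name} moves toward {target}")

    def dribble_toward(self, target: str) -> None:
        self.log.append(f"{self.name} dribbles toward {target}")

    def move_to_open_space(self) -> None:
        self.log.append(f"{self.name} moves to open space")

def give_and_go(me: Player, teammate: Player) -> None:
    """A learned tactic rendered as an editable, human-readable program."""
    if me.has_ball:
        if teammate.is_open:
            me.pass_to(teammate)    # narration: "pass when my teammate is open"
            me.move_toward("goal")  # narration: "then cut toward the goal"
        else:
            me.dribble_toward("goal")
    else:
        me.move_to_open_space()

if __name__ == "__main__":
    alice, bob = Player("Alice", has_ball=True), Player("Bob")
    give_and_go(alice, teammate=bob)
    print("\n".join(alice.log))
```

Because the tactic is ordinary code rather than opaque model weights, a user (or the system on the user's behalf) can edit a single condition, such as teammate.is_open, and change behavior in a predictable, inspectable way.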

📝 Abstract
Teaching systems physical tasks is a long-standing goal in HCI, yet most prior work has focused on non-collaborative physical activities. Collaborative tasks introduce added complexity, requiring systems to infer users' assumptions about their teammates' intent, an inherently ambiguous and dynamic process. This necessitates representations that are interpretable and correctable, enabling users to inspect and refine system behavior. We address this challenge by framing collaborative task learning as a program synthesis problem. Our system represents behavior as editable programs and uses narrated demonstrations, i.e., paired physical actions and natural language, as a unified modality for teaching, inspecting, and correcting system logic without requiring users to see or write code. The same modality is used by the system to communicate its learning to users. In a within-subjects study, 20 users taught multiplayer soccer tactics to our system. 70 percent (14/20) of participants successfully refined learned programs to match their intent, and 90 percent (18/20) found it easy to correct the programs. The study surfaced unique challenges in representing learning as programs and in enabling users to teach collaborative physical activities. We discuss these issues and outline mitigation strategies.
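The abstract's "unified modality" claim, where narrated demonstrations serve teaching, inspecting, and correcting alike, can be pictured with a small sketch. Everything here, from the NarratedDemo layout to the keyword-matching correct function, is an assumption for illustration; a real synthesizer would do far more than echo narrations back.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical sketch of narrated-demonstration input and one correction
# step; the data layout and the toy "synthesizer" are assumptions for
# illustration, far simpler than the paper's actual system.

@dataclass
class NarratedDemo:
    # Each step pairs a physical action with the user's spoken description.
    steps: List[Tuple[str, str]]  # (action, narration)

def explain(program: List[str]) -> str:
    """Render the learned program back as natural language for inspection."""
    return "I learned to: " + ", then ".join(program)

def correct(program: List[str], feedback: str) -> List[str]:
    """Apply a natural-language correction as an edit to the program."""
    # Toy rule: "don't X" removes the matching step (a stand-in for
    # semantic-level alignment between feedback and program).
    if feedback.lower().startswith("don't "):
        banned = feedback[6:].strip()
        return [step for step in program if banned not in step]
    return program

demo = NarratedDemo(steps=[
    ("pass(teammate)", "pass to my open teammate"),
    ("sprint(goal)", "run toward the goal"),
    ("shoot()", "shoot at the first opening"),
])
program = [narration for _, narration in demo.steps]  # "synthesis" stand-in
print(explain(program))  # system explains what it learned
program = correct(program, "don't shoot at the first opening")
print(explain(program))  # corrected program, re-explained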
Problem

Research questions and friction points this paper is trying to address.

Modeling collaborative physical activities from narrated demonstrations
Inferring user assumptions about teammate intent in collaborative tasks
Creating interpretable and correctable program representations for systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Program synthesis for collaborative task learning
Editable programs from narrated demonstrations
Unified modality for teaching and correcting logic
Authors
Edward Kim
University of California, Berkeley
Daniel He
University of California, Berkeley
Jorge Chao
University of California, Berkeley
Wiktor Rajca
University of California, Berkeley
Mohammed Amin
University of California, Berkeley
Nishant Malpani
University of California, Berkeley
Ruta Desai
FAIR (Meta AI)
Embodied AI · Computer Vision · Human-Computer Interaction
Antti Oulasvirta
Professor, Aalto University
Human-Computer Interaction · Computational Modeling of Behavior
Bjoern Hartmann
Associate Professor of EECS, University of California, Berkeley
Human-Computer Interaction
Sanjit Seshia
University of California, Berkeley