ReStory: VLM-augmentation of Social Human-Robot Interaction Datasets

📅 2024-12-30
🤖 AI Summary
Real-world human-robot interaction (HRI) data is scarce, expensive to collect, and difficult to scale, and the variety of robot form factors and interaction modalities compounds the problem. The paper proposes ReStory, a vision-language model (VLM)-based method for augmenting small, in-the-wild HRI datasets, inspired by recent ethnomethodology and conversation analysis (EMCA) work in HRI. With human supervision, ReStory synthesizes human-interpretable interaction scenarios in the form of storyboards. The authors present it as a way for HRI researchers and interaction designers to get more out of scarce data.

📝 Abstract
Internet-scaled datasets are a luxury for human-robot interaction (HRI) researchers, as collecting natural interaction data in the wild is time-consuming and logistically challenging. The problem is exacerbated by robots' different form factors and interaction modalities. Inspired by recent work on ethnomethodological and conversation analysis (EMCA) in the domain of HRI, we propose ReStory, a method that has the potential to augment existing in-the-wild human-robot interaction datasets leveraging Vision Language Models. While still requiring human supervision, ReStory is capable of synthesizing human-interpretable interaction scenarios in the form of storyboards. We hope our proposed approach provides HRI researchers and interaction designers with a new angle to utilizing their valuable and scarce data.

Problem

Research questions and friction points this paper is trying to address.

Human-Robot Interaction
Data Collection
Real-World Scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

ReStory Method
Data Enrichment
Human-Computer Interaction