GIST: Group Interaction Sensing Toolkit for Mixed Reality

📅 2025-07-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing mixed reality (MR) collaborative research relies on external devices or unimodal data, limiting real-time, deployable interaction awareness. This work proposes the first real-time, multimodal group interaction perception system leveraging only onboard MR headset sensors—speech, eye gaze, and spatial pose—eliminating the need for external cameras or offline annotation. By integrating temporal modeling with dynamic network analysis, the system automatically infers both static structural properties and transient behavioral patterns during collaboration, uncovering the coupling between behavioral dynamics and interaction network evolution. Evaluated with 48 participants organized into 12 four-person teams, the system accurately captures fine-grained collaborative transitions, demonstrating high effectiveness and practicality. This work establishes a scalable, lightweight sensing foundation for real-time collaborative support in MR environments.
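The summary describes fusing speech, gaze, and spatial pose into an interaction network. As a minimal illustration of the general idea (not GIST's actual algorithm), the sketch below accumulates undirected pairwise edge weights from per-second gaze targets and spatial proximity; the participant names, sample format, thresholds, and weighting scheme are all assumptions for the example.

```python
from collections import defaultdict

# Hypothetical per-second samples: who each person gazes at, and 2-D positions.
# The data layout and values are illustrative, not the paper's real format.
samples = [
    {"gaze": {"p1": "p2", "p2": "p1", "p3": "p4", "p4": "p3"},
     "pos": {"p1": (0, 0), "p2": (1, 0), "p3": (5, 0), "p4": (6, 0)}},
    {"gaze": {"p1": "p2", "p2": "p1", "p3": "p1", "p4": "p3"},
     "pos": {"p1": (0, 0), "p2": (1, 0), "p3": (2, 0), "p4": (6, 0)}},
]

PROX_THRESH = 2.0  # metres; assumed proximity cutoff


def interaction_network(samples):
    """Accumulate undirected pairwise edge weights from gaze and proximity."""
    weights = defaultdict(float)
    for s in samples:
        # Each observed gaze event adds weight to the gazer-target pair.
        for src, tgt in s["gaze"].items():
            if tgt != src:
                weights[frozenset((src, tgt))] += 1.0
        # Standing within the proximity threshold adds a smaller weight.
        people = list(s["pos"])
        for i, a in enumerate(people):
            for b in people[i + 1:]:
                (xa, ya), (xb, yb) = s["pos"][a], s["pos"][b]
                if ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5 <= PROX_THRESH:
                    weights[frozenset((a, b))] += 0.5
    return dict(weights)


net = interaction_network(samples)
```

Aggregating over a whole session yields the static network; recomputing over short windows yields the dynamic, evolving structure the paper analyzes.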

📝 Abstract
Understanding how teams coordinate, share work, and negotiate roles in immersive environments is critical for designing effective mixed-reality (MR) applications that support real-time collaboration. However, existing methods either rely on external cameras and offline annotation or focus narrowly on single modalities, limiting their validity and applicability. To address this, we present a novel group interaction sensing toolkit (GIST), a deployable system that passively captures multi-modal interaction data (speech, gaze, and spatial proximity) from a commodity MR headset's sensors and automatically derives both overall static interaction networks and dynamic moment-by-moment behavior patterns. We evaluate GIST in a human-subject study with 48 participants across 12 four-person groups performing an open-ended image-sorting task in MR. Our analysis shows strong alignment between the identified behavior modes and shifts in interaction network structure, confirming that momentary changes in speech, gaze, and proximity are observable through the sensor data.
Problem

Research questions and friction points this paper is trying to address.

Understand team coordination in mixed-reality environments
Overcome limitations of single-modality sensing methods
Capture multi-modal interaction data for dynamic behavior analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Passively captures multi-modal interaction data
Uses only a commodity MR headset's onboard sensors
Derives static and dynamic behavior patterns
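The dynamic side of the contribution, deriving moment-by-moment behavior patterns, can be sketched as a sliding-window analysis. The toy example below slices a stream of per-second speaker labels into fixed windows and flags transitions where the dominant speaker changes; the window size, stream, and transition rule are assumptions for illustration, not the toolkit's actual method.

```python
from collections import Counter


def dominant_speakers(speaker_stream, window=4):
    """Most frequent speaker label within each fixed-size window."""
    return [
        Counter(speaker_stream[i:i + window]).most_common(1)[0][0]
        for i in range(0, len(speaker_stream), window)
    ]


def transitions(dominants):
    """Window indices where the dominant speaker changes."""
    return [i for i in range(1, len(dominants)) if dominants[i] != dominants[i - 1]]


# Hypothetical 12-second stream of per-second speaker labels.
stream = ["p1", "p1", "p1", "p2",   # window 0: p1 dominates
          "p2", "p2", "p3", "p2",   # window 1: p2 dominates
          "p2", "p2", "p2", "p1"]   # window 2: p2 still dominates
doms = dominant_speakers(stream)
```

In a fuller version, each window would carry a multi-modal feature vector (speech share, gaze patterns, proximity), and detected transitions would be aligned against changes in the interaction network structure.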