🤖 AI Summary
Dynamic alignment of shared mental models (SMMs) remains challenging in human–machine teaming (HMT). Method: We propose a lightweight, interpretable real-time collaboration verification framework. We develop a browser-based testbed in Minecraft, integrated with WebRTC for low-latency audio-video synchronization and a first-person multi-source recording system, enabling continuous, partially observable human–AI interaction. Post-task, we deploy a GPT-4–driven debriefing tool supporting dual-perspective replay and natural-language, question-answering–style explanations of AI behavior. Contribution/Results: This work introduces the first lightweight Minecraft testbed and an explainable debriefing paradigm tailored to HMT. Empirical evaluation shows deployment time ≤5 minutes; user accuracy in inferring AI intent improves by 37% (n=42), significantly lowering barriers to HMT research.
📝 Abstract
In this work, we present two novel contributions toward improving research in human-machine teaming (HMT): 1) a Minecraft testbed to accelerate testing and deployment of collaborative AI agents and 2) a tool that lets users revisit and analyze behaviors within an HMT episode to facilitate shared mental model development. Our browser-based Minecraft testbed allows rapid testing of collaborative agents with real humans in a continuous-space, real-time, partially observable environment, without the cumbersome setup typical of human-AI interaction user studies. As Minecraft has an extensive player base and a rich ecosystem of pre-built AI agents, we hope this contribution can accelerate research on the design of new collaborative agents and on understanding human factors within HMT. Our mental model alignment tool facilitates user-led post-mission analysis by providing replayable video displays of the first-person perspectives of the team members (i.e., the human and the AI), together with a chat interface that leverages GPT-4 to answer queries about the AI's experiences and model details.