Characterizing Language Use in a Collaborative Situated Game

📅 2025-12-03
🤖 AI Summary
This study addresses the scarcity of corpora covering language phenomena typical of collaborative video games, such as complex spatial reference, clarification and repair, and ad-hoc convention formation. The authors collect 11.5 hours of spoken dialogue from players in the co-op mode of *Portal 2* and build the Portal Dialogue Corpus, comprising 24.5K utterances. The publicly released corpus includes player videos, audio, transcripts, game state data, and both manual and automatic annotations of the language data. It is intended to support future analyses of language use in complex, situated, collaborative problem-solving scenarios.

📝 Abstract
Cooperative video games, where multiple participants must coordinate by communicating and reasoning under uncertainty in complex environments, yield a rich source of language data. We collect the Portal Dialogue Corpus: a corpus of 11.5 hours of spoken human dialogue in the co-op mode of the popular Portal 2 virtual puzzle game, comprising 24.5K total utterances. We analyze player language and behavior, identifying a number of linguistic phenomena that rarely appear in most existing chitchat or task-oriented dialogue corpora, including complex spatial reference, clarification and repair, and ad-hoc convention formation. To support future analyses of language use in complex, situated, collaborative problem-solving scenarios, we publicly release the corpus, which comprises player videos, audio, transcripts, game state data, and both manual and automatic annotations of language data.
Problem

Research questions and friction points this paper is trying to address.

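Characterizes language use in cooperative video games, where players must coordinate under uncertainty in complex environments
Identifies linguistic phenomena that are rare in existing chitchat and task-oriented dialogue corpora, such as complex spatial reference, clarification and repair, and ad-hoc convention formation
Releases a corpus for studying language use in situated, collaborative problem-solving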
Innovation

Methods, ideas, or system contributions that make the work stand out.

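Collects the Portal Dialogue Corpus: 11.5 hours of spoken dialogue (24.5K utterances) from the co-op mode of Portal 2
Analyzes player language and behavior in a collaborative puzzle-solving setting
Releases a multimodal corpus of player videos, audio, transcripts, and game state data, with both manual and automatic annotations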