🤖 AI Summary
The original Minecraft Dialogue Corpus (MDC) lacks expert annotations of anaphoric and deictic reference, which limits its usefulness for studying task-oriented, multi-turn, situated dialogue in a dynamic environment. To address this, we introduce MDC-R, which supplements MDC with expert annotations of anaphoric and deictic expressions. We describe the annotation method and the resulting corpus, and analyze the data both quantitatively and qualitatively. A short experiment on referring expression comprehension illustrates the corpus's usefulness as a resource for reference resolution in Minecraft-based situated dialogue.
📝 Abstract
We introduce the Minecraft Dialogue Corpus with Reference (MDC-R), a new language resource that supplements the original Minecraft Dialogue Corpus (MDC) with expert annotations of anaphoric and deictic reference. MDC's task-oriented, multi-turn, situated dialogue in a dynamic environment gives rise to interesting linguistic phenomena and has motivated multiple annotation efforts. We believe it can also serve as a valuable resource when annotated with reference. Here, we describe our annotation method and the resulting corpus, and provide both a quantitative and a qualitative analysis of the data. Furthermore, we carry out a short experiment demonstrating the usefulness of our corpus for referring expression comprehension.