🤖 AI Summary
This study addresses the real-time prediction of listeners' comprehension states (understanding, partial understanding, non-understanding, and misunderstanding) in explanatory dialogues. The authors propose a cognitive modeling approach that integrates linguistic and non-linguistic cues, combining three types of cognitive-load-related features: information value, syntactic complexity, and interactive gaze behavior. Using the MUNDEX corpus, they apply both statistical analysis and machine learning methods, including off-the-shelf classifiers and a fine-tuned German BERT-based multimodal model. Experimental results show that all three feature categories are significantly associated with comprehension states, and that their integration consistently improves prediction accuracy on the four-state classification task. These findings support the use of multidimensional cognitive-load indicators for fine-grained modeling of listener understanding in dialogue.
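The summary does not spell out the model architecture, but the kind of fusion it describes (a German BERT text representation concatenated with the three numeric cue features before a four-way classification head) might look like the minimal sketch below. The model name `bert-base-german-cased`, the layer sizes, and the cue values are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MultimodalComprehensionClassifier(nn.Module):
    """Fuses a BERT [CLS] embedding with numeric cognitive-load cues."""

    def __init__(self, bert_name="bert-base-german-cased", n_cues=3, n_states=4):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # Concatenate the [CLS] vector with the cue vector
        # (surprisal, syntactic complexity, gaze variation).
        self.head = nn.Sequential(
            nn.Linear(hidden + n_cues, 128),
            nn.ReLU(),
            nn.Linear(128, n_states),  # Understanding / Partial / Non- / Mis-
        )

    def forward(self, input_ids, attention_mask, cues):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        return self.head(torch.cat([cls, cues], dim=-1))

tok = AutoTokenizer.from_pretrained("bert-base-german-cased")
model = MultimodalComprehensionClassifier()
batch = tok(["Dann ziehst du eine Karte."], return_tensors="pt")
cues = torch.tensor([[5.2, 0.7, 0.3]])  # hypothetical cue values for one utterance
logits = model(batch["input_ids"], batch["attention_mask"], cues)  # shape: (1, 4)
```

Late fusion of this kind keeps the cue features interpretable and lets the same pipeline run with or without them, which is how an ablation comparing "text only" against "text plus cues" would typically be set up.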
📝 Abstract
We investigate how verbal and nonverbal linguistic features, exhibited by speakers and listeners in dialogue, can contribute to predicting the listener's state of understanding in explanatory interactions on a moment-by-moment basis. Specifically, we examine three linguistic cues related to cognitive load and hypothesised to correlate with listener understanding: the information value (operationalised with surprisal) and syntactic complexity of the speaker's utterances, and the variation in the listener's interactive gaze behaviour. Based on statistical analyses of the MUNDEX corpus of face-to-face dialogic board game explanations, we find that individual cues vary with the listener's level of understanding. Listener states ('Understanding', 'Partial Understanding', 'Non-Understanding' and 'Misunderstanding') were self-annotated by the listeners using a retrospective video-recall method. The results of a subsequent classification experiment, involving two off-the-shelf classifiers and a fine-tuned German BERT-based multimodal classifier, demonstrate that prediction of these four states of understanding is generally possible and improves when the three linguistic cues are considered alongside textual features.
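To make the surprisal cue concrete: surprisal is the negative log-probability of a word given its left context, surprisal(w) = −log₂ P(w | context), so less predictable words carry higher information value. A minimal sketch of computing per-token surprisal with an off-the-shelf German causal language model follows; the model choice `dbmdz/german-gpt2` is an assumption for illustration, not necessarily the model used in the paper.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "dbmdz/german-gpt2"  # illustrative choice of German LM
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def token_surprisals(text):
    """Return (token, surprisal-in-bits) pairs; higher = less predictable."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Log-probability of each token given its left context
    # (the logits at position i predict token i+1, hence the shift by one).
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    token_lp = log_probs[torch.arange(targets.size(0)), targets]
    bits = (-token_lp / math.log(2)).tolist()  # convert nats to bits
    return list(zip(tokenizer.convert_ids_to_tokens(targets.tolist()), bits))

for token, s in token_surprisals("Das Spiel beginnt mit dem Würfeln."):
    print(f"{token:>12}  {s:6.2f} bits")
```

Averaging or pooling these per-token values over an utterance yields a single information-value feature per speaker turn, which is the granularity at which a cue like this can be aligned with the listener's self-annotated states.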