Pragmatic Frames Evoked by Gestures: A FrameNet Brasil Approach to Multimodality in Turn Organization

📅 2025-09-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Multimodal dialogue systems lack robust models of how linguistic and gestural cues jointly regulate turn-taking, largely because systematic, functionally grounded gesture annotations are missing. Method: We introduce a turn-function-oriented gesture annotation schema on the real-world Frame2 corpus, uncovering previously undocumented gesture variants and their cognitive motivations. Drawing on FrameNet Brasil, we integrate mental spaces theory, conceptual blending, and metaphor theory to develop a dual-layer semantic–pragmatic annotation framework. Contribution/Results: Our analysis confirms that co-speech gestures serve as pragmatic cues for passing, taking, and keeping conversational turns. The resulting annotations are designed to be both cognitively interpretable and usable as machine-learning data for modeling multimodal dialogue structure, with potential applications in multimodal human–machine interaction.

📝 Abstract

This paper proposes a framework for modeling multimodal conversational turn organization by correlating language with interactive gestures, based on an analysis of how pragmatic frames are conceptualized and evoked by communicators. To provide evidence for the analysis, we developed an annotation methodology to enrich a multimodal dataset (annotated for semantic frames) with pragmatic frames modeling conversational turn organization. Although conversational turn organization has been studied by researchers from diverse fields, the specific strategies used by communicators, especially gestures, had not yet been encoded in a dataset that can be used for machine learning. To fill this gap, we enriched the Frame2 dataset with annotations of gestures used for turn organization. The Frame2 dataset features 10 episodes from the Brazilian TV series Pedro Pelo Mundo annotated for semantic frames evoked in both video and text. This dataset allowed us to closely observe how communicators use interactive gestures outside a laboratory, in settings that, to our knowledge, have not previously been recorded in the related literature. Our results confirm that communicators in face-to-face conversation use gestures as a tool for passing, taking and keeping conversational turns, and also reveal variations of some gestures that had not been documented before. We propose that the use of these gestures arises from the conceptualization of pragmatic frames, involving mental spaces, blending and conceptual metaphors. In addition, our data demonstrate that the annotation of pragmatic frames contributes to a deeper understanding of human cognition and language.
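To make the idea of a machine-learning-ready gesture annotation more concrete, here is a minimal Python sketch of what one record might look like. It is an illustration only: the class names, field names, gesture labels, and sample records are all hypothetical and are not taken from the Frame2 dataset or the FrameNet Brasil tooling. The only element drawn from the abstract is the set of three turn functions (passing, taking, keeping).

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class TurnFunction(Enum):
    """The three turn-organization functions named in the abstract."""
    PASS = "pass"
    TAKE = "take"
    KEEP = "keep"


@dataclass(frozen=True)
class GestureAnnotation:
    """One annotated gesture. All field names are hypothetical."""
    episode: int            # episode index within the corpus
    start_s: float          # gesture onset, in seconds
    end_s: float            # gesture offset, in seconds
    gesture: str            # free-text gesture label (made up here)
    function: TurnFunction  # pragmatic turn function assigned by the annotator

    def __post_init__(self):
        # Reject intervals that end before they start.
        if self.end_s < self.start_s:
            raise ValueError("end_s must not be earlier than start_s")


def count_by_function(annotations):
    """Tally how many annotated gestures serve each turn function."""
    return Counter(a.function for a in annotations)


# Made-up records for illustration; these are not results from the paper.
sample = [
    GestureAnnotation(1, 12.0, 13.5, "open palm toward listener", TurnFunction.PASS),
    GestureAnnotation(1, 20.2, 21.0, "raised index finger", TurnFunction.KEEP),
    GestureAnnotation(2, 5.0, 5.8, "raised hand", TurnFunction.KEEP),
]
counts = count_by_function(sample)
```

A flat record like this keeps the pragmatic layer separate from any semantic-frame layer, so the two can be joined later by episode and time span. Whether the actual annotation uses time spans, frame elements, or another structure is not stated in the abstract.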

Problem

Research questions and friction points this paper is trying to address.

Modeling multimodal turn organization via language-gesture correlations

Developing annotation methodology for pragmatic frames in conversations

Enriching dataset with gestures for turn-taking machine learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal annotation methodology for gestures

Enriched dataset with pragmatic frame annotations

Correlation analysis between gestures and language
