Conveying Meaning through Gestures: An Investigation into Semantic Co-Speech Gesture Generation

📅 2025-10-20
🤖 AI Summary
This study investigates how semantic enhancement affects the quality and human perception of generated co-speech gestures. It compares two frameworks, AQ-GT (no explicit semantic input) and AQ-GT-a (a variant augmented with explicit semantic annotations), in a user-centered evaluation along two dimensions: concept recognition and human-likeness. The test sentences come from the SAGA corpus, from contextually similar sentences, and from novel movement-focused sentences. Results show that explicit semantic enhancement does not universally improve performance: AQ-GT conveys concepts more effectively within its training domain, whereas AQ-GT-a, although not perceived as more human-like, is rated as more expressive and helpful and generalizes better to novel contexts, particularly when representing shape and size. The core contribution is empirical evidence of a potential trade-off between specialization and generalization, indicating that the effectiveness of semantic enrichment depends heavily on context.

📝 Abstract
This study explores two frameworks for co-speech gesture generation, AQ-GT and its semantically-augmented variant AQ-GT-a, to evaluate their ability to convey meaning through gestures and how humans perceive the resulting movements. Using sentences from the SAGA spatial communication corpus, contextually similar sentences, and novel movement-focused sentences, we conducted a user-centered evaluation of concept recognition and human-likeness. Results revealed a nuanced relationship between semantic annotations and performance. The original AQ-GT framework, lacking explicit semantic input, was surprisingly more effective at conveying concepts within its training domain. Conversely, the AQ-GT-a framework demonstrated better generalization, particularly for representing shape and size in novel contexts. While participants rated gestures from AQ-GT-a as more expressive and helpful, they did not perceive them as more human-like. These findings suggest that explicit semantic enrichment does not guarantee improved gesture generation and that its effectiveness is highly dependent on the context, indicating a potential trade-off between specialization and generalization.

Problem

Research questions and friction points this paper is trying to address.

Evaluating gesture generation frameworks for conveying semantic meaning
Assessing human perception of concept recognition and human-likeness
Investigating trade-offs between semantic enrichment and generalization capability

Innovation

Methods, ideas, or system contributions that make the work stand out.

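AQ-GT generates gestures without explicit semantic input and conveys concepts best within its training domain
AQ-GT-a adds explicit semantic annotations and generalizes better to novel contexts
Semantic enrichment improves the representation of shape and size in novel contexts, but does not make gestures appear more human-like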