Hybrid-supervised Hypergraph-enhanced Transformer for Micro-gesture Based Emotion Recognition

📅 2025-07-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Prior work on micro-gesture emotion recognition remains limited, particularly in modeling fine-grained affective dynamics from skeletal sequences. Method: This paper proposes a hypergraph-enhanced Transformer framework—the first to introduce hypergraph modeling for skeleton-based micro-emotion analysis. It features a hypergraph self-attention module with progressively updated hyperedges to explicitly capture high-order, time-varying joint interactions; integrates multi-scale temporal convolutions and a self-supervised reconstruction decoder to precisely encode subtle motion patterns of micro-gestures; and enables end-to-end joint optimization of the emotion classification head and reconstruction task within the encoder. Results: Evaluated on iMiGUE and SMG benchmarks, our method achieves state-of-the-art performance, significantly outperforming existing approaches in accuracy, macro-F1, and other key metrics—demonstrating the efficacy of hypergraph structures for modeling micro-expression-level emotional states.

Technology Category

Application Category

📝 Abstract
Micro-gestures are unconsciously performed body gestures that can convey the emotion states of humans and start to attract more research attention in the fields of human behavior understanding and affective computing as an emerging topic. However, the modeling of human emotion based on micro-gestures has not been explored sufficiently. In this work, we propose to recognize the emotion states based on the micro-gestures by reconstructing the behavior patterns with a hypergraph-enhanced Transformer in a hybrid-supervised framework. In the framework, hypergraph Transformer based encoder and decoder are separately designed by stacking the hypergraph-enhanced self-attention and multiscale temporal convolution modules. Especially, to better capture the subtle motion of micro-gestures, we construct a decoder with additional upsampling operations for a reconstruction task in a self-supervised learning manner. We further propose a hypergraph-enhanced self-attention module where the hyperedges between skeleton joints are gradually updated to present the relationships of body joints for modeling the subtle local motion. Lastly, for exploiting the relationship between the emotion states and local motion of micro-gestures, an emotion recognition head from the output of encoder is designed with a shallow architecture and learned in a supervised way. The end-to-end framework is jointly trained in a one-stage way by comprehensively utilizing self-reconstruction and supervision information. The proposed method is evaluated on two publicly available datasets, namely iMiGUE and SMG, and achieves the best performance under multiple metrics, which is superior to the existing methods.
Problem

Research questions and friction points this paper is trying to address.

Recognize emotion states using micro-gestures via hybrid-supervised framework
Model subtle local motion with hypergraph-enhanced self-attention modules
Improve performance on micro-gesture datasets iMiGUE and SMG
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid-supervised hypergraph-enhanced Transformer framework
Self-supervised decoder with upsampling for motion capture
Hypergraph self-attention for subtle joint motion modeling
🔎 Similar Papers
No similar papers found.
Zhaoqiang Xia
Zhaoqiang Xia
Northwestern Polytechnical University
Visual ComputingInformation Processing
H
Hexiang Huang
School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
H
Haoyu Chen
University of Oulu
X
Xiaoyi Feng
School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
Guoying Zhao
Guoying Zhao
Academy Professor, IEEE Fellow, Professor of Computer Science and Engineering, University of Oulu
Affective ComputingArtificial IntelligenceComputer VisionPattern Recognition