MEDN: Motion-Emotion Feature Decoupling Network for Micro-Expression Recognition

📅 2026-04-20
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses a core difficulty in micro-expression recognition: identical facial action units (AUs) can correspond to different emotions, so the resulting expressions look highly similar, and existing approaches overlook implicit emotional cues. The authors propose a dual-branch network in which a motion branch guided by AU detection extracts explicit dynamic features, and an emotion branch built on a Sparse Emotion Vision Transformer (SEViT) captures multi-scale local temporal emotional cues. An orthogonality loss explicitly disentangles the two feature types, and a Collaborative Fusion Module (CoFM) adaptively integrates them. The method is presented as the first to explicitly disentangle motion and emotion features in micro-expressions, and it is reported to outperform state-of-the-art methods on three benchmark datasets, improving both recognition accuracy and generalization.

📝 Abstract
Unlike macro-expressions, micro-expressions do not follow a strictly consistent mapping rule between emotions and Action Units (AUs). As a result, some micro-expressions share identical AUs yet represent completely opposite emotional categories, making them highly visually similar. Existing micro-expression recognition (MER) methods mostly rely on explicit facial motion cues (e.g., optical flow, frame differences, AU features) while ignoring implicit emotion information. To tackle this issue, this paper presents a Motion-Emotion Feature Decoupling Network (MEDN) for MER. We design a dual-branch framework to extract motion and emotion features separately. In the motion branch, an AU-detection task restricts features to the explicit motion domain, and an orthogonal loss is adopted to reduce motion-emotion feature coupling. For implicit emotion modeling, we propose a Sparse Emotion Vision Transformer (SEViT) that sparsifies spatial tokens to highlight local temporal variations with multi-scale sparsity rates. A Collaborative Fusion Module (CoFM) is further developed to adaptively fuse the disentangled motion and emotion features. Extensive experiments on three benchmark datasets validate that MEDN effectively decouples motion and emotion features and achieves superior recognition performance, offering a new perspective for enhancing recognition accuracy and generalization.
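
The abstract does not give the exact form of the orthogonal loss. As a rough illustration only, one common way to discourage coupling between two feature sets is to penalize the squared cosine similarity between paired vectors. The sketch below assumes that form; the function name `orthogonality_loss`, the `eps` parameter, and the use of plain Python lists are hypothetical choices made for this example and are not taken from the paper.

```python
import math


def orthogonality_loss(motion, emotion, eps=1e-8):
    """Mean squared cosine similarity between paired feature vectors.

    ASSUMPTION: this formula is a generic orthogonality penalty, not
    necessarily the one used in MEDN. `motion` and `emotion` are lists
    of equal-length feature vectors, one pair per sample.
    """
    if len(motion) != len(emotion) or not motion:
        raise ValueError("expected two non-empty batches of equal size")

    total = 0.0
    for m, e in zip(motion, emotion):
        dot = sum(a * b for a, b in zip(m, e))
        norm_m = math.sqrt(sum(a * a for a in m))
        norm_e = math.sqrt(sum(b * b for b in e))
        cos = dot / (norm_m * norm_e + eps)  # eps avoids division by zero
        total += cos * cos                   # zero when the pair is orthogonal
    return total / len(motion)


# Orthogonal pairs give no penalty; parallel pairs give a penalty near 1.
loss_orthogonal = orthogonality_loss([[1.0, 0.0]], [[0.0, 1.0]])
loss_parallel = orthogonality_loss([[1.0, 0.0]], [[2.0, 0.0]])
```

Under this assumed form, minimizing the loss pushes each sample's motion and emotion vectors toward orthogonal directions, which is one way to read "reduce motion-emotion feature coupling".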
Problem

Research questions and friction points this paper is trying to address.

micro-expression recognition
emotion-action unit inconsistency
motion-emotion decoupling
visually similar micro-expressions
implicit emotion information
Innovation

Methods, ideas, or system contributions that make the work stand out.

feature decoupling
micro-expression recognition
Sparse Emotion Vision Transformer
orthogonal loss
Collaborative Fusion Module
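
The Sparse Emotion Vision Transformer listed above is described as sparsifying spatial tokens at several sparsity rates. The page does not say how tokens are scored or selected, so the following is only a minimal sketch that assumes a top-k rule over per-token scores (for example, the magnitude of temporal variation at each spatial location). The names `sparsify_tokens` and `multi_scale_sparsify`, and the keep rates used, are hypothetical.

```python
import math


def sparsify_tokens(scores, keep_rate):
    """Return the sorted indices of the highest-scoring tokens.

    ASSUMPTION: top-k selection by score is an illustrative rule,
    not necessarily the selection rule used by SEViT.
    """
    if not 0.0 < keep_rate <= 1.0:
        raise ValueError("keep_rate must be in (0, 1]")
    if not scores:
        return []
    k = max(1, math.ceil(len(scores) * keep_rate))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def multi_scale_sparsify(scores, keep_rates=(0.75, 0.5, 0.25)):
    """Apply the same scores at several keep rates (multi-scale sparsity)."""
    return {rate: sparsify_tokens(scores, rate) for rate in keep_rates}


# Eight spatial tokens with hypothetical temporal-variation scores.
example_scores = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7, 0.2, 0.6]
kept = multi_scale_sparsify(example_scores)
# kept[0.25] holds the two most active tokens, kept[0.5] the top four.
```

Each keep rate yields a nested subset of tokens here, so lower rates focus on the most strongly varying locations while higher rates retain more spatial context.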
Chenxing Hu
School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710071, China; Xi’an Key Laboratory of Big Data and Intelligent Vision, Xi’an, Shaanxi 710071, China; Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi’an, Shaanxi 710071, China
Kun Xie
Associate Professor, Old Dominion University
Transportation Safety, Transportation Resilience, AI, Connected and Autonomous Vehicles
Qiguang Miao
School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710071, China; Xi’an Key Laboratory of Big Data and Intelligent Vision, Xi’an, Shaanxi 710071, China; Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi’an, Shaanxi 710071, China
Ruyi Liu
School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710071, China; Xi’an Key Laboratory of Big Data and Intelligent Vision, Xi’an, Shaanxi 710071, China; Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi’an, Shaanxi 710071, China
Quan Wang
School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710071, China
Zongkai Yang
National Engineering Research Center of Educational Big Data and Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China