🤖 AI Summary
Micro-expression recognition (MER) is more challenging than macro-expression recognition because micro-expressions are extremely brief and low in intensity. Existing methods often rely on a single source of prior knowledge, which limits how well heterogeneous information can be integrated. To address this, we propose MPFNet, a Multi-Prior Fusion Network with two variants inspired by developmental psychology: MPFNet-P, which fuses priors in parallel, and MPFNet-C, which fuses them hierarchically. MPFNet uses progressive training and two complementary encoders, the Generic Feature Encoder (GFE) and the Advanced Feature Encoder (AFE), both built on an Inflated 3D ConvNet (I3D) backbone with Coordinate Attention (CA) to capture spatiotemporal and channel-specific features. Evaluated on SMIC, CASME II, and SAMM, MPFNet achieves accuracies of 81.1%, 92.4%, and 85.7%, respectively, which is reported as state-of-the-art on SMIC and SAMM. The work targets collaborative modeling and progressive optimization of multi-source priors in MER.
📝 Abstract
Micro-expression recognition (MER), a critical subfield of affective computing, presents greater challenges than macro-expression recognition because micro-expressions are brief and low in intensity. While incorporating prior knowledge has been shown to enhance MER performance, existing methods predominantly rely on a single, simple source of prior knowledge and fail to fully exploit multi-source information. This paper introduces the Multi-Prior Fusion Network (MPFNet), which uses a progressive training strategy to optimize MER. We propose two complementary encoders, the Generic Feature Encoder (GFE) and the Advanced Feature Encoder (AFE), both based on Inflated 3D ConvNets (I3D) with Coordinate Attention (CA) mechanisms, to improve the model's ability to capture spatiotemporal and channel-specific features. Inspired by developmental psychology, we present two variants of MPFNet, MPFNet-P and MPFNet-C, corresponding to two fundamental modes of infant cognitive development: parallel and hierarchical processing. These variants enable the evaluation of different strategies for integrating prior knowledge. Extensive experiments demonstrate that MPFNet improves MER accuracy while maintaining balanced performance across categories, achieving accuracies of 0.811, 0.924, and 0.857 on the SMIC, CASME II, and SAMM datasets, respectively. To the best of our knowledge, our approach achieves state-of-the-art performance on the SMIC and SAMM datasets.
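The abstract reports overall accuracy and also claims balanced performance across categories. As a minimal sketch of how those two notions differ, the snippet below computes accuracy alongside per-class recall and its unweighted mean. The labels are hypothetical toy data chosen for illustration, not results from the paper, and the function names are assumptions rather than anything from the authors' code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per class in y_true: correctly predicted / total for that class."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical, deliberately imbalanced toy labels (not data from the paper).
y_true = ["negative"] * 6 + ["positive"] * 2 + ["surprise"] * 2
y_pred = ["negative"] * 6 + ["negative", "positive"] + ["negative", "surprise"]

acc = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
macro_recall = sum(recalls.values()) / len(recalls)

print(f"accuracy     = {acc:.3f}")
print(f"recalls      = {recalls}")
print(f"macro recall = {macro_recall:.3f}")
```

In this toy case the majority class dominates the accuracy figure while the two minority classes are recognized only half the time, which is why a claim of balanced performance is usually checked with per-class figures in addition to accuracy.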