MPT: Motion Prompt Tuning for Micro-Expression Recognition

📅 2025-08-12

🤖 AI Summary
Micro-expression recognition (MER) faces two key challenges: severe scarcity of labeled data and the inability of existing large models to effectively capture transient, subtle facial movements. To address these, this work introduces prompt learning to MER for the first time, proposing Motion Prompt Tuning (MPT). MPT enhances dynamic features via motion magnification, employs Gaussian tokenization to generate temporally aware motion prompts, and incorporates a group adapter to improve the large model's adaptability to discriminative micro-expression representations. Crucially, MPT requires no modification to the backbone architecture and achieves efficient few-shot transfer learning using only lightweight, plug-and-play modules. Extensive experiments on three benchmark datasets, CASME II, SAMM, and SMIC, demonstrate significant improvements over state-of-the-art methods, validating both effectiveness and cross-dataset generalizability.

📝 Abstract
Micro-expression recognition (MER) is crucial in the affective computing field due to its wide application in medical diagnosis, lie detection, and criminal investigation. Despite its significance, obtaining micro-expression (ME) annotations is challenging due to the expertise required from psychological professionals. Consequently, ME datasets often suffer from a scarcity of training samples, severely constraining the learning of MER models. While current large pre-training models (LMs) offer general and discriminative representations, their direct application to MER is hindered by an inability to capture transitory and subtle facial movements, essential elements for effective MER. This paper introduces Motion Prompt Tuning (MPT) as a novel approach to adapting LMs for MER, representing a pioneering method for subtle motion prompt tuning. Particularly, we introduce motion prompt generation, including motion magnification and Gaussian tokenization, to extract subtle motions as prompts for LMs. Additionally, a group adapter is carefully designed and inserted into the LM to enhance it in the target MER domain, facilitating a more nuanced distinction of ME representation. Furthermore, extensive experiments conducted on three widely used MER datasets demonstrate that our proposed MPT consistently surpasses state-of-the-art approaches and verifies its effectiveness.
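The abstract describes a motion prompt generation stage: subtle motions are magnified and then pooled into prompt tokens via Gaussian tokenization. The paper's exact formulation is not reproduced here; the sketch below is a minimal, hypothetical reading in which simple frame differences stand in for the motion magnifier, and each prompt token is a Gaussian-weighted temporal average of the motion features centred at a different point of the clip. All function names and parameters are illustrative.

```python
import numpy as np

def gaussian_tokenize(motion, num_prompts=4, sigma=1.0):
    """Pool a (T, D) motion-feature sequence into `num_prompts` tokens.
    Each token is a Gaussian-weighted temporal average centred at a
    different timestep (one hypothetical reading of 'Gaussian tokenization')."""
    T, _ = motion.shape
    centres = np.linspace(0, T - 1, num_prompts)           # (P,)
    t = np.arange(T)[None, :]                              # (1, T)
    w = np.exp(-0.5 * ((t - centres[:, None]) / sigma) ** 2)  # (P, T)
    w /= w.sum(axis=1, keepdims=True)                      # normalise each row
    return w @ motion                                      # (P, D) prompt tokens

def motion_prompts(frames, amplification=3.0, num_prompts=4):
    """Frame differences scaled by `amplification` stand in for motion
    magnification here; the paper's actual magnifier is more elaborate."""
    diffs = amplification * np.diff(frames, axis=0)        # (T-1, D)
    return gaussian_tokenize(diffs, num_prompts)

# Toy clip: 8 frames of 16-dim features; the resulting prompt tokens
# would be prepended to the frozen LM's patch/frame tokens.
clip = np.random.default_rng(0).normal(size=(8, 16))
prompts = motion_prompts(clip)
print(prompts.shape)  # (4, 16)
```

Because only the prompt-generation modules produce trainable signal, the backbone can stay frozen, which matches the paper's claim of lightweight, plug-and-play few-shot transfer.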
Problem

Research questions and friction points this paper is trying to address.

Addresses micro-expression recognition with limited training samples
Adapts large pre-training models for subtle facial motion analysis
Overcomes challenges in capturing transient micro-expression features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Motion Prompt Tuning adapts LMs to MER without modifying the backbone
Motion magnification and Gaussian tokenization extract subtle motions as prompts
Group adapter sharpens discriminative ME representations in the target domain
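The group adapter named above is described only at a high level; a common adapter pattern it plausibly follows is a residual bottleneck whose down-projection is split into groups, cutting parameters versus a dense projection. The sketch below illustrates that pattern under stated assumptions; the class name, grouping scheme, and dimensions are hypothetical, not the paper's exact design.

```python
import numpy as np

class GroupAdapter:
    """Residual bottleneck adapter with a grouped down-projection
    (illustrative structure, not the paper's exact module)."""
    def __init__(self, dim=16, bottleneck=8, groups=4, seed=0):
        assert dim % groups == 0 and bottleneck % groups == 0
        rng = np.random.default_rng(seed)
        self.groups = groups
        # One small projection per feature group instead of one dense matrix:
        # groups * (dim/g * bottleneck/g) params vs. dim * bottleneck.
        self.down = [rng.normal(scale=0.02, size=(dim // groups, bottleneck // groups))
                     for _ in range(groups)]
        self.up = rng.normal(scale=0.02, size=(bottleneck, dim))

    def __call__(self, x):
        """x: (N, dim) features from a frozen LM block."""
        chunks = np.split(x, self.groups, axis=1)
        h = np.concatenate([c @ w for c, w in zip(chunks, self.down)], axis=1)
        h = np.maximum(h, 0.0)              # ReLU non-linearity
        return x + h @ self.up              # residual: adapter only perturbs features

x = np.ones((2, 16))
y = GroupAdapter()(x)
print(y.shape)  # (2, 16)
```

The residual connection means the adapter starts near the identity, so inserting it into a pre-trained LM leaves the original representations intact until the small grouped weights are tuned on MER data.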
👥 Authors

Jiateng Liu
PhD student @ UIUC
NLP and its applications; MLLM; LLM agents

Hengcan Shi
Hunan University
Computer Vision; Machine Learning

Feng Chen
Australian Institute for Machine Learning, the University of Adelaide

Zhiwen Shao
School of Computer Science and Technology, China University of Mining and Technology; Mine Digitization Engineering Research Center of the Ministry of Education, Xuzhou 221116, China

Yaonan Wang
School of Artificial Intelligence and Robotics, Hunan University

Jianfei Cai
Professor of Data Science & AI, Monash University
Visual computing; multimedia; computer vision; multimedia networking

Wenming Zheng
Southeast University
Affective Computing; Pattern Recognition; Computer Vision