From Continuous sEMG Signals to Discrete Muscle State Tokens: A Robust and Interpretable Representation Framework

📅 2026-02-27
🤖 AI Summary
This study addresses the poor robustness and limited interpretability of surface electromyography (sEMG) decoding, which stem from high inter-subject variability and noise sensitivity. The authors propose a physiology-informed discretization framework: sliding windows aligned to the minimal muscle contraction cycle isolate individual activation events, ten time–frequency features (e.g., RMS, MDF) are extracted from each window, and K-means clustering groups the windows into muscle-state tokens. Evaluated on the newly introduced multi-action, multi-muscle dataset ActionEMG-43 (43 actions, 16 muscle groups), the tokens show high cross-subject consistency (Cohen’s Kappa = 0.82 ± 0.09) and yield Top-1 action recognition accuracies of 75.5% with a Vision Transformer and 67.9% with an SVM, compared with 72.8% and 64.4% for raw-signal baselines, while reducing input dimensionality by 96% and supporting interpretable analysis of movement quality.

📝 Abstract
Surface electromyography (sEMG) signals exhibit substantial inter-subject variability and are highly susceptible to noise, posing challenges for robust and interpretable decoding. To address these limitations, we propose a discrete representation of sEMG signals based on a physiology-informed tokenization framework. The method employs a sliding window aligned with the minimal muscle contraction cycle to isolate individual muscle activation events. From each window, ten time-frequency features, including root mean square (RMS) and median frequency (MDF), are extracted, and K-means clustering is applied to group segments into representative muscle-state tokens. We also introduce a large-scale benchmark dataset, ActionEMG-43, comprising 43 diverse actions and sEMG recordings from 16 major muscle groups across the body. Based on this dataset, we conduct extensive evaluations to assess the inter-subject consistency, representation capacity, and interpretability of the proposed sEMG tokens. Our results show that the token representation exhibits high inter-subject consistency (Cohen's Kappa = 0.82 ± 0.09), indicating that the learned tokens capture consistent and subject-independent muscle activation patterns. In action recognition tasks, models using sEMG tokens achieve Top-1 accuracies of 75.5% with ViT and 67.9% with SVM, outperforming raw-signal baselines (72.8% and 64.4%, respectively), despite a 96% reduction in input dimensionality. In movement quality assessment, the tokens intuitively reveal patterns of muscle underactivation and compensatory activation, offering interpretable insights into neuromuscular control. Together, these findings highlight the effectiveness of tokenized sEMG representations as a compact, generalizable, and physiologically meaningful feature space for applications in rehabilitation, human-machine interaction, and motor function analysis.
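The tokenization steps described in the abstract (sliding window, per-window features, K-means) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it computes only the two features the abstract names (RMS and MDF) rather than all ten, and the sampling rate, window length, hop, number of clusters, the farthest-point initialization, and the synthetic two-phase test signal are all hypothetical choices made for the example.

```python
import numpy as np

def window_features(x, fs):
    """Return [RMS, median frequency] for one window of samples."""
    rms = np.sqrt(np.mean(x ** 2))
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    cum = np.cumsum(power)
    # Median frequency: first bin where cumulative power reaches half the total.
    mdf = freqs[np.searchsorted(cum, cum[-1] / 2.0)]
    return np.array([rms, mdf])

def sliding_windows(signal, win, hop):
    """Stack fixed-length windows taken every `hop` samples."""
    starts = range(0, len(signal) - win + 1, hop)
    return np.stack([signal[s:s + win] for s in starts])

def assign(X, centers):
    """Index of the nearest center for each row of X."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(X, k, iters=20):
    """Plain Lloyd iterations with a deterministic farthest-point init
    (an assumption of this sketch; the source does not specify the init)."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dist.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return assign(X, centers)

def tokenize(signal, fs, win, hop, k):
    """Map a 1-D signal to one integer token per window."""
    feats = np.stack([window_features(w, fs) for w in sliding_windows(signal, win, hop)])
    # Standardize so RMS and MDF contribute on comparable scales.
    z = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-12)
    return kmeans(z, k)

# Synthetic stand-in for one channel: a weak 50 Hz phase, then a strong 150 Hz phase.
fs = 1000                      # assumed sampling rate in Hz
t = np.arange(4000) / fs
signal = np.where(t < 2.0,
                  0.1 * np.sin(2 * np.pi * 50 * t),
                  1.0 * np.sin(2 * np.pi * 150 * t))

# win=200 samples (200 ms) is a placeholder, not the paper's contraction cycle.
tokens = tokenize(signal, fs=fs, win=200, hop=200, k=2)
print(tokens)
```

On this toy signal the windows from the weak phase and the windows from the strong phase fall into two different clusters, so the 4000-sample signal is summarized by 20 tokens. With real recordings the clustering would be fit on windows pooled across subjects and channels, a detail the abstract does not spell out.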
Problem

Research questions and friction points this paper is trying to address.

sEMG
inter-subject variability
noise susceptibility
robust decoding
interpretable representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

sEMG tokenization
physiology-informed representation
muscle state tokens
inter-subject consistency
ActionEMG-43
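The inter-subject consistency listed above is reported in the abstract as Cohen's Kappa. As a rough illustration of that statistic, the sketch below computes kappa for two token sequences, assuming they come from two subjects performing the same action and are aligned window by window. The alignment, the function name `cohens_kappa`, and the example sequences are assumptions for this example; the source does not describe how the sequences were paired.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of positions with identical tokens.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each sequence's marginal token frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[t] * count_b[t] for t in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both sequences use a single identical token throughout
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical token sequences for two subjects (made-up values).
subject_a = [0, 0, 1, 1, 2, 2, 2, 0]
subject_b = [0, 0, 1, 2, 2, 2, 2, 0]
kappa = cohens_kappa(subject_a, subject_b)
print(round(kappa, 3))  # 0.805
```

Here 7 of 8 windows agree (observed agreement 0.875) and chance agreement is 23/64, giving kappa = 33/41, about 0.805. Unlike raw agreement, kappa discounts matches that would be expected from the token frequencies alone.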
Yuepeng Chen
School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing, China; Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing, China
Kaili Zheng
Department of Electronic Engineering, Tsinghua University, Beijing, China
Ji Wu
Tsinghua University
Artificial intelligence, smart healthcare, machine learning, pattern recognition, speech recognition
Zhuangzhuang Li
Department of Electronic Engineering, Tsinghua University, Beijing, China
Ye Ma
Research Academy of Grand Health, Ningbo University, Ningbo, China
Dongwei Liu
School of Information Technology and Artificial Intelligence, Zhejiang University of Finance and Economics, Hangzhou, China
Chenyi Guo
Department of Electronic Engineering, Tsinghua University, Beijing, China; Institute for Precision Medicine, Tsinghua University, Beijing, China
Xiangling Fu
School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing, China; Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing, China