Frequency-Aware Masked Autoencoders for Human Activity Recognition using Accelerometers

📅 2025-02-17

🤖 AI Summary
Labeled wearable accelerometer data are scarce, which limits the performance of human activity recognition (HAR). This paper proposes a time-series Transformer masked autoencoder (MAE) for self-supervised pretraining tailored to HAR. Its core contribution is a novel spectrogram-based loss, the log-scale mean magnitude (LMM) loss, proposed as an alternative to the conventional mean squared error (MSE) reconstruction loss. Pretrained on unlabeled UK Biobank accelerometry (n = 109k), the LMM model reaches a balanced accuracy of 0.848 on downstream HAR with a linear classifier, 13.9 percentage points above the MSE-pretrained model (0.709). Compared with the prior state of the art for HAR, also pretrained on UK Biobank, LMM-pretrained models perform better with a linear classifier and comparably with an LSTM classifier.
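
For readers unfamiliar with the metric, balanced accuracy is conventionally the mean of per-class recall, which keeps a rare activity class from being swamped by a common one. The sketch below is illustrative only: the labels are made up, the helper name `balanced_accuracy` is ours, and only the two scores 0.848 and 0.709 come from the summary above.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Toy, made-up labels (not from the paper): class 0 is common, class 1 is rare.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
score = balanced_accuracy(y_true, y_pred)  # recalls are 1.0 and 0.5

# Gap between the two reported balanced accuracies, in percentage points.
gap_points = round((0.848 - 0.709) * 100, 1)
print(score, gap_points)
```

Plain accuracy on the toy labels would be 5/6, while the balanced score is lower because the rare class is recognized only half the time.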

📝 Abstract
Wearable accelerometers are widely used for continuous monitoring of physical activity. Supervised machine learning and deep learning algorithms have long been used to extract meaningful activity information from raw accelerometry data, but progress has been hampered by the limited amount of publicly available labeled data. Exploiting large unlabeled datasets using self-supervised pretraining is a relatively new and underexplored approach in the field of human activity recognition (HAR). We used a time-series transformer masked autoencoder (MAE) approach to self-supervised pretraining and propose a novel spectrogram-based loss function named the log-scale mean magnitude (LMM) loss. We compared MAE models pretrained with the LMM loss to one pretrained with the mean squared error (MSE) loss. We leveraged the large unlabeled UK Biobank accelerometry dataset (n = 109k) for pretraining and evaluated downstream HAR performance using a linear classifier on a smaller labeled dataset. We found that pretraining with the LMM loss improved performance compared to a model pretrained with the MSE loss, with balanced accuracies of 0.848 and 0.709, respectively. Further analysis revealed that better convergence of the LMM loss, but not the MSE loss, correlated significantly with improved downstream performance (r = -0.61, p = 0.04 for balanced accuracy). Finally, we compared our MAE models to the state-of-the-art for HAR, also pretrained on the UK Biobank accelerometry data. Our LMM-pretrained models performed better when finetuned using a linear classifier and performed comparably when finetuned using an LSTM classifier, while MSE-pretrained models consistently underperformed. Our findings demonstrate that the LMM loss is a robust and effective method for pretraining MAE models on accelerometer data for HAR. Future work should explore optimizing loss function combinations and extending our approach to other tasks.
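
The abstract names the LMM loss but does not give its formula, so the following is only a rough sketch of what a spectrogram-based, log-scale reconstruction loss might look like, set beside time-domain MSE. Everything here is an assumption for illustration: the function names, the Hann-windowed STFT with `n_fft=64` and `hop=32`, the mean absolute difference of log magnitudes, the `eps` constant, and the synthetic 30 Hz signal. It should not be read as the authors' definition.

```python
import numpy as np

def stft_magnitude(x, n_fft=64, hop=32):
    """Magnitude spectrogram of a 1-D signal from Hann-windowed frames."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=-1))

def log_magnitude_loss(target, recon, eps=1e-6):
    """Assumed form: mean |log S_target - log S_recon| over all time-frequency bins."""
    s_t = stft_magnitude(target)
    s_r = stft_magnitude(recon)
    return float(np.mean(np.abs(np.log(s_t + eps) - np.log(s_r + eps))))

def mse_loss(target, recon):
    """Time-domain mean squared error."""
    diff = np.asarray(target, dtype=float) - np.asarray(recon, dtype=float)
    return float(np.mean(diff ** 2))

# Synthetic example (not paper data): strong 1 Hz motion plus a weak 8 Hz component.
t = np.arange(300) / 30.0
target = np.sin(2 * np.pi * 1.0 * t) + 0.05 * np.sin(2 * np.pi * 8.0 * t)
recon = np.sin(2 * np.pi * 1.0 * t)  # a reconstruction that drops the weak component

mse = mse_loss(target, recon)
lmm_like = log_magnitude_loss(target, recon)
print(f"MSE: {mse:.5f}  log-magnitude loss: {lmm_like:.5f}")
```

In this toy setup the dropped low-amplitude component contributes very little to the MSE, whereas a log-scale spectral term weights relative errors in weak frequency bands more heavily. Whether that is the mechanism behind the reported results is not something the abstract states.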

Problem

Research questions and friction points this paper is trying to address.

Self-supervised pretraining for HAR
Novel LMM loss for accelerometer data
Improving performance with large unlabeled datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Time-series transformer masked autoencoder
Spectrogram-based LMM loss function
Self-supervised pretraining on UK Biobank
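
As a minimal illustration of the masked-autoencoder idea listed above, the sketch below splits an accelerometer window into patches and hides a random subset that a model would then be trained to reconstruct. The patch length, mask ratio, window shape, and the `mask_patches` helper are all hypothetical; the page does not state the masking scheme used in the paper, and practical MAEs often drop masked tokens from the encoder input instead of zeroing them.

```python
import numpy as np

def mask_patches(x, patch_len=30, mask_ratio=0.5, seed=0):
    """Split a (time, channels) window into patches and zero out a random subset.

    Returns the original patches, the masked copy, and the boolean patch mask.
    """
    x = np.asarray(x, dtype=float)
    n_patches = x.shape[0] // patch_len
    patches = x[: n_patches * patch_len].reshape(n_patches, patch_len, x.shape[1])
    rng = np.random.default_rng(seed)
    n_masked = int(round(mask_ratio * n_patches))
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)
    mask = np.zeros(n_patches, dtype=bool)
    mask[masked_idx] = True
    visible = patches.copy()
    visible[mask] = 0.0  # hidden patches are blanked; the model must reconstruct them
    return patches, visible, mask

# Synthetic tri-axial window (random noise standing in for real accelerometry).
window = np.random.default_rng(1).normal(size=(300, 3))
patches, visible, mask = mask_patches(window)
print(patches.shape, int(mask.sum()))
```

A reconstruction loss, whether MSE or a spectral loss, would then compare the model output with `patches`, typically on the masked positions.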
Niels R. Lorenzen
Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Danish Centre for Sleep Medicine, Copenhagen University Hospital-Rigshospitalet, Glostrup, Denmark; Department of Psychiatry and Behavioral Sciences, Stanford University, CA, USA
P. J. Jennum
Danish Centre for Sleep Medicine, Copenhagen University Hospital-Rigshospitalet, Glostrup, Denmark
Emmanuel Mignot
Stanford University Professor
sleep, immunology, genetics, neuroscience, engineering
A. Brink-Kjaer
Department of Health Technology, Technical University of Denmark, Lyngby, Denmark