🤖 AI Summary
High annotation cost and the difficulty of labeling subtle motions severely limit few-shot user perception on IMU data. Method: We propose a multi-granularity semantic self-supervised pretraining framework that combines a lightweight IMU feature extractor with hierarchical semantic modeling, and employs Bayesian optimization to dynamically weight diverse self-supervised tasks (spanning motion segments, behavioral patterns, and contextual scenes) to enhance representation discriminability. Contribution/Results: With only about 100 labeled samples per class, our method reaches over 90% of the accuracy of fully supervised models trained on tens of thousands of labeled samples across three representative user perception tasks, with no additional system overhead. To our knowledge, this is the first work to enable efficient fine-grained semantic perception from IMU signals under few-shot settings, establishing a new paradigm for low-resource wearable sensing.
📝 Abstract
Inertial measurement units (IMUs) have been widely used in mobile perception applications such as activity recognition and user authentication, where a large amount of labelled data is normally required to train a satisfactory model. However, labelling micro-activities in massive IMU data is difficult because raw IMU signals are hard to interpret and ground truth is scarce. In this paper, we propose a novel fine-grained user perception approach, called Saga, which needs only a small amount of labelled IMU data to achieve high user perception accuracy. The core idea of Saga is to first pre-train a backbone feature extraction model by exploiting the rich multi-level semantic information embedded in massive unlabelled IMU data. Then, for a specific downstream user perception application, Bayesian Optimization is employed to determine the optimal weights of the pre-training tasks at different semantic levels. We implement Saga on five typical mobile phones and evaluate it on three typical user perception tasks over three IMU datasets. Results show that, using only about 100 training samples per class, Saga achieves over 90% of the accuracy of a full-fledged model trained on over ten thousand training samples, with no additional system overhead.
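The abstract does not give the weighting algorithm, but the key idea (combine several self-supervised pretraining losses with per-task weights, then tune those weights against downstream few-shot performance) can be sketched minimally. Everything below is an illustrative assumption: the task names, the toy `downstream_score` objective, and the use of plain random search as a lightweight stand-in for the paper's Bayesian optimization.

```python
import random

# Hypothetical task names at three semantic levels; the real framework's
# tasks and loss functions are not specified in the abstract.
TASKS = ["segment", "pattern", "scene"]

def combined_loss(task_losses, weights):
    """Weighted sum of the per-task self-supervised losses."""
    return sum(weights[t] * task_losses[t] for t in TASKS)

def downstream_score(weights):
    """Placeholder for: pretrain with these task weights, then measure
    few-shot validation accuracy on the downstream perception task.
    Here it is a toy objective peaked at (0.5, 0.3, 0.2), purely for
    illustration."""
    target = {"segment": 0.5, "pattern": 0.3, "scene": 0.2}
    return 1.0 - sum((weights[t] - target[t]) ** 2 for t in TASKS)

def search_weights(n_trials=200, seed=0):
    """Search the task-weight simplex for the best downstream score.
    Random search is used here as a simple stand-in for Bayesian
    optimization; a real implementation would fit a surrogate model
    (e.g. a Gaussian process) over evaluated weight vectors."""
    rng = random.Random(seed)
    best_w, best_s = None, float("-inf")
    for _ in range(n_trials):
        raw = [rng.random() for _ in TASKS]
        total = sum(raw)
        w = {t: r / total for t, r in zip(TASKS, raw)}  # normalize onto simplex
        s = downstream_score(w)
        if s > best_s:
            best_w, best_s = w, s
    return best_w, best_s

best_w, best_s = search_weights()
print(best_w, best_s)
```

Because each pretraining run is expensive, a sample-efficient optimizer such as Bayesian optimization (rather than the random search shown here) makes sense: it chooses the next weight vector to evaluate based on all previous (weights, score) pairs.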