🤖 AI Summary
High annotation cost and the difficulty of labeling subtle motions severely limit few-shot user perception on IMU data. Method: We propose a multi-granularity semantic self-supervised pretraining framework that combines a lightweight IMU feature extractor with hierarchical semantic modeling, and employs Bayesian optimization to dynamically weight diverse self-supervised tasks (spanning motion segments, behavioral patterns, and contextual scenes) to enhance representation discriminability. Contribution/Results: With only about 100 labeled samples per class, our method reaches over 90% of the accuracy of fully supervised models trained on tens of thousands of labeled samples across three representative user perception tasks, with no additional system overhead. To our knowledge, this is the first work to enable efficient fine-grained semantic perception from IMU signals under few-shot settings, establishing a new paradigm for low-resource wearable sensing.
📝 Abstract
Inertial measurement units (IMUs) have been widely used in mobile perception applications such as activity recognition and user authentication, where a large amount of labelled data is normally required to train a satisfactory model. However, labelling micro-activities in massive IMU data is difficult because raw IMU signals are hard to interpret and ground truth is scarce. In this paper, we propose a novel fine-grained user perception approach, called Saga, which needs only a small amount of labelled IMU data to achieve high user perception accuracy. The core idea of Saga is to first pre-train a backbone feature extraction model by exploiting the rich multi-level semantic information embedded in massive unlabelled IMU data. Then, for a specific downstream user perception application, Bayesian Optimization is employed to determine the optimal weights of the pre-training tasks at different semantic levels. We implement Saga on five typical mobile phones and evaluate it on three typical user perception tasks over three IMU datasets. Results show that, using only about 100 training samples per class, Saga achieves over 90% of the accuracy of a full-fledged model trained on over ten thousand training samples, with no additional system overhead.
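The abstract does not give the weighting algorithm, but the key idea (combine several self-supervised pretraining losses with per-task weights, then tune those weights against downstream few-shot performance) can be sketched minimally. Everything below is an illustrative assumption: the task names, the toy `downstream_score` objective, and the use of plain random search as a lightweight stand-in for the paper's Bayesian optimization.

```python
import random

# Hypothetical task names at three semantic levels; the real framework's
# tasks and loss functions are not specified in the abstract.
TASKS = ["segment", "pattern", "scene"]

def combined_loss(task_losses, weights):
    """Weighted sum of the per-task self-supervised losses."""
    return sum(weights[t] * task_losses[t] for t in TASKS)

def downstream_score(weights):
    """Placeholder for: pretrain with these task weights, then measure
    few-shot validation accuracy on the downstream perception task.
    Here it is a toy objective peaked at (0.5, 0.3, 0.2), purely for
    illustration."""
    target = {"segment": 0.5, "pattern": 0.3, "scene": 0.2}
    return 1.0 - sum((weights[t] - target[t]) ** 2 for t in TASKS)

def search_weights(n_trials=200, seed=0):
    """Search the task-weight simplex for the best downstream score.
    Random search is used here as a simple stand-in for Bayesian
    optimization; a real implementation would fit a surrogate model
    (e.g. a Gaussian process) over evaluated weight vectors."""
    rng = random.Random(seed)
    best_w, best_s = None, float("-inf")
    for _ in range(n_trials):
        raw = [rng.random() for _ in TASKS]
        total = sum(raw)
        w = {t: r / total for t, r in zip(TASKS, raw)}  # normalize onto simplex
        s = downstream_score(w)
        if s > best_s:
            best_w, best_s = w, s
    return best_w, best_s

best_w, best_s = search_weights()
print(best_w, best_s)
```

Because each pretraining run is expensive, a sample-efficient optimizer such as Bayesian optimization (rather than the random search shown here) makes sense: it chooses the next weight vector to evaluate based on all previous (weights, score) pairs.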