🤖 AI Summary
Wearable devices generate high-dimensional physiological time-series data, yet it remains poorly understood which temporal scales drive the predictive utility of these signals. To address this, we propose HiMAE, a hierarchical masked autoencoder that turns temporal resolution from a hyperparameter into an interpretable probe. Combining a hierarchical convolutional encoder-decoder with a hierarchical masking strategy, HiMAE produces multi-resolution embeddings that reveal which temporal scales carry task-relevant signal. The framework supports classification, regression, and generative tasks, and is plug-and-play after pretraining. On diverse clinical and behavioral prediction benchmarks, HiMAE outperforms existing large foundation models while using 10–100× fewer parameters. It also achieves sub-millisecond inference latency on smartwatch CPUs, demonstrating both scientific interpretability and practical feasibility for edge deployment.
📝 Abstract
Wearable sensors provide abundant physiological time series, yet the principles governing their predictive utility remain unclear. We hypothesize that temporal resolution is a fundamental axis of representation learning, with different clinical and behavioral outcomes relying on structure at distinct scales. To test this resolution hypothesis, we introduce HiMAE (Hierarchical Masked Autoencoder), a self-supervised framework that combines masked autoencoding with a hierarchical convolutional encoder-decoder. HiMAE produces multi-resolution embeddings that enable systematic evaluation of which temporal scales carry predictive signal, transforming resolution from a hyperparameter into a probe for interpretability. Across classification, regression, and generative benchmarks, HiMAE consistently outperforms state-of-the-art foundation models that collapse scale, while being orders of magnitude smaller. HiMAE is an efficient representation learner compact enough to run entirely on-watch, achieving sub-millisecond inference on smartwatch-class CPUs for true edge inference. Together, these contributions position HiMAE as both an efficient self-supervised learning method and a discovery tool for scale-sensitive structure in wearable health.
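
To make the two ingredients named in the abstract concrete, the following is a minimal, illustrative sketch of patch-wise masking on a 1D signal and of a multi-resolution view of that signal. It is not the authors' implementation: the function names, the patch size, the mask ratio, and the use of average pooling as a stand-in for a strided convolutional encoder are all assumptions made for illustration.

```python
import random

def patch_mask(length, patch_size, mask_ratio, seed=0):
    """Return a keep-mask (1 = visible, 0 = masked) that hides whole patches.

    Assumption: contiguous, non-overlapping patches; any trailing samples
    that do not fill a complete patch are left visible.
    """
    rng = random.Random(seed)
    n_patches = length // patch_size
    n_masked = round(n_patches * mask_ratio)
    masked_patches = rng.sample(range(n_patches), n_masked)
    mask = [1] * length
    for p in masked_patches:
        for i in range(p * patch_size, (p + 1) * patch_size):
            mask[i] = 0
    return mask

def downsample(signal, factor=2):
    """Average-pool by `factor` (a hypothetical stand-in for a strided conv)."""
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, len(signal) - factor + 1, factor)
    ]

def resolution_pyramid(signal, levels):
    """Return the signal followed by `levels` progressively coarser views."""
    views = [list(signal)]
    for _ in range(levels):
        views.append(downsample(views[-1]))
    return views

def masked_mse(target, recon, mask):
    """Mean squared error computed only on masked positions (mask == 0)."""
    errors = [(t - r) ** 2 for t, r, m in zip(target, recon, mask) if m == 0]
    return sum(errors) / len(errors) if errors else 0.0

# Toy usage: a 64-sample periodic signal, 8-sample patches, half of them masked.
signal = [float(i % 8) for i in range(64)]
mask = patch_mask(len(signal), patch_size=8, mask_ratio=0.5)
pyramid = resolution_pyramid(signal, levels=3)
zero_recon_loss = masked_mse(signal, [0.0] * len(signal), mask)
```

In a setup like this, each level of `pyramid` corresponds to one temporal resolution, so a separate lightweight probe could be trained on each level to compare which scale carries the most signal for a given task. The reconstruction loss is restricted to masked positions, as is typical for masked autoencoding.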