HiMAE: Hierarchical Masked Autoencoders Discover Resolution-Specific Structure in Wearable Time Series

📅 2025-10-28

🤖 AI Summary

Wearable devices generate high-dimensional physiological time-series data, yet the temporal-scale mechanisms underlying their predictive utility remain poorly understood. To address this, we propose HiMAE, a hierarchical masked autoencoder that turns temporal resolution from a hyperparameter into an interpretable probe. Using a hierarchical convolutional encoder-decoder architecture and a novel hierarchical masking strategy, HiMAE discovers task-relevant temporal-scale structure and learns multi-scale representations. The framework supports classification, regression, and generative tasks, and is plug-and-play after pretraining. On diverse clinical and behavioral prediction benchmarks, HiMAE outperforms existing large foundation models while using 10–100× fewer parameters. It also achieves sub-millisecond inference latency on smartwatch CPUs, demonstrating both scientific interpretability and practical edge-deployment feasibility.

📝 Abstract

Wearable sensors provide abundant physiological time series, yet the principles governing their predictive utility remain unclear. We hypothesize that temporal resolution is a fundamental axis of representation learning, with different clinical and behavioral outcomes relying on structure at distinct scales. To test this resolution hypothesis, we introduce HiMAE (Hierarchical Masked Autoencoder), a self-supervised framework that combines masked autoencoding with a hierarchical convolutional encoder-decoder. HiMAE produces multi-resolution embeddings that enable systematic evaluation of which temporal scales carry predictive signal, transforming resolution from a hyperparameter into a probe for interpretability. Across classification, regression, and generative benchmarks, HiMAE consistently outperforms state-of-the-art foundation models that collapse scale, while being orders of magnitude smaller. HiMAE is an efficient representation learner compact enough to run entirely on-watch, achieving sub-millisecond inference on smartwatch-class CPUs for true edge inference. Together, these contributions position HiMAE as both an efficient self-supervised learning method and a discovery tool for scale-sensitive structure in wearable health.
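
The two ingredients named in the abstract, masked autoencoding and a hierarchy of temporal resolutions, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the patch length, mask ratio, and the use of average pooling as a stand-in for the learned convolutional encoder are all assumptions made for illustration, and the function names (`patch_mask`, `masked_mse`, `multires_embeddings`) are hypothetical.

```python
import numpy as np

def patch_mask(length, patch_len, mask_ratio, rng):
    """Boolean mask over samples; True marks samples hidden from the encoder.

    Assumes `length` is a multiple of `patch_len` (illustrative simplification).
    """
    n_patches = length // patch_len
    n_masked = int(round(mask_ratio * n_patches))
    masked = rng.choice(n_patches, size=n_masked, replace=False)
    patch_flags = np.zeros(n_patches, dtype=bool)
    patch_flags[masked] = True
    # Expand per-patch flags to per-sample flags.
    return np.repeat(patch_flags, patch_len)

def masked_mse(target, recon, mask):
    """Mean squared error computed only on the masked samples."""
    return float(np.mean((target[mask] - recon[mask]) ** 2))

def multires_embeddings(x, n_levels):
    """Stand-in encoder (assumption): halve the temporal resolution at each
    level by average pooling, and keep every level's output."""
    levels = []
    h = x
    for _ in range(n_levels):
        h = h.reshape(-1, 2).mean(axis=1)
        levels.append(h)
    return levels

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 8 * np.pi, 256))   # toy 1-D "sensor" trace
mask = patch_mask(len(signal), patch_len=16, mask_ratio=0.5, rng=rng)
visible = np.where(mask, 0.0, signal)             # zero out masked patches
levels = multires_embeddings(visible, n_levels=4)

# A trained decoder would produce `recon`; an all-zero baseline is used here
# only to show how the loss is restricted to masked positions.
recon = np.zeros_like(signal)
loss = masked_mse(signal, recon, mask)
print([len(h) for h in levels], loss)
```

In a real model the pooling step would be replaced by learned strided convolutions and the decoder would mirror the encoder; the sketch only shows where the mask enters and how one embedding per resolution level falls out of the hierarchy.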

Problem

Research questions and friction points this paper is trying to address.

Discovering resolution-specific structure in wearable sensor time series data

Determining which temporal scales carry predictive signals for clinical outcomes

Developing efficient hierarchical autoencoders for edge inference on wearables
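
The second question, which temporal scales carry predictive signal, can be made concrete with a resolution-probing sketch: fit one linear probe per resolution level and compare held-out accuracy. Everything below is an assumption for illustration. The data are synthetic (the label is deliberately encoded in a sample-to-sample alternation that pooling averages away), average pooling stands in for HiMAE's learned multi-resolution embeddings, and `pool_levels` and `linear_probe_accuracy` are hypothetical helper names.

```python
import numpy as np

def pool_levels(X, n_levels):
    """Return [X, X pooled by 2, X pooled by 4, ...] along the time axis."""
    levels = [X]
    for _ in range(n_levels - 1):
        X = X.reshape(X.shape[0], -1, 2).mean(axis=2)
        levels.append(X)
    return levels

def linear_probe_accuracy(F_train, y_train, F_test, y_test):
    """Least-squares linear probe with a bias term; labels are -1 or +1."""
    A_train = np.hstack([F_train, np.ones((len(F_train), 1))])
    A_test = np.hstack([F_test, np.ones((len(F_test), 1))])
    w, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    pred = np.sign(A_test @ w)
    return float(np.mean(pred == y_test))

# Synthetic data (assumption): the class flips the sign of a fast alternation,
# so the label is visible at the finest scale and cancels after pooling by 2.
rng = np.random.default_rng(0)
n, T = 500, 64
y = rng.choice([-1.0, 1.0], size=n)
alternation = (-1.0) ** np.arange(T)
X = rng.normal(size=(n, T)) + y[:, None] * alternation

n_train = 300
accs = []
for F in pool_levels(X, n_levels=4):
    acc = linear_probe_accuracy(F[:n_train], y[:n_train], F[n_train:], y[n_train:])
    accs.append(acc)

for level, acc in enumerate(accs):
    print(f"level {level} (pooling factor {2 ** level}): accuracy {acc:.2f}")
```

On this toy problem only the finest level separates the classes, while the pooled levels contain noise alone. That per-level accuracy profile is the kind of readout that treats resolution as a probe instead of a fixed hyperparameter.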

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical masked autoencoder for multi-resolution embeddings

Combines masked autoencoding with convolutional encoder-decoder

Enables efficient edge inference on wearable devices
