🤖 AI Summary
This work addresses the limitations of existing wearable foundation models, which predominantly rely on static encoders for short-term retrospective prediction and struggle to model chronic, progressive, or intermittent health conditions. To overcome these challenges, the study proposes three paradigm shifts: establishing a structured longitudinal multimodal data ecosystem, developing multimodal modeling techniques tailored to long-horizon time series, and designing an agent-based reasoning system capable of supporting clinical interventions. By integrating long-context modeling, temporal abstraction, personalized inference, and decision-making under uncertainty, this approach enables a transition from retrospective monitoring to prospective health reasoning. The resulting framework provides continuous, goal-aligned support for long-term health conditions such as chronic diseases, substantially enhancing the clinical applicability and forward-looking capabilities of wearable systems in real-world settings.
📝 Abstract
Wearable foundation models (WFMs), trained on large volumes of data collected by affordable, always-on devices, have demonstrated strong performance on short-term, well-defined health monitoring tasks, including activity recognition, fitness tracking, and cardiovascular signal assessment. However, most existing WFMs primarily map short temporal windows to predefined labels via static encoders, emphasizing retrospective prediction rather than reasoning over evolving personal history, context, and future risk trajectories. As a result, they are poorly suited to modeling chronic, progressive, or episodic health conditions that unfold over weeks, months, or years. Hence, we argue that WFMs must move beyond static encoders and be explicitly designed for longitudinal, anticipatory health reasoning. We identify three foundational shifts required to enable this transition: (1) Structurally rich data, which moves beyond isolated datasets and outcome-conditioned collection toward integrated multimodal data, long-term personal trajectories, and contextual metadata, ideally supported by open and interoperable data ecosystems; (2) Longitudinal-aware multimodal modeling, which prioritizes long-context inference, temporal abstraction, and personalization over cross-sectional or population-level prediction; and (3) Agentic inference systems, which move beyond static prediction to support planning, decision-making, and clinically grounded intervention under uncertainty. Together, these shifts reframe wearable health monitoring from retrospective signal interpretation toward continuous, anticipatory, and human-aligned health support.