ActivityNarrated: An Open-Ended Narrative Paradigm for Wearable Human Activity Understanding

📅 2026-04-01
🤖 AI Summary
Existing wearable-based human activity recognition methods rely on predefined closed-set categories, limiting their ability to handle open-ended, personalized, and combinatorial activities encountered in real-world scenarios. This work proposes a novel open-domain paradigm centered on natural language activity narratives. By constructing multimodal data from multi-position wearable sensors paired with temporally aligned free-text descriptions, the authors design a language-conditioned neural architecture and introduce a retrieval-based evaluation protocol that does not require fixed activity categories. The resulting framework unifies open-vocabulary understanding with traditional closed-set recognition. In cross-participant evaluations, it achieves a Macro-F1 score of 65.3%, substantially outperforming strong baselines (31–34%), thereby demonstrating its effectiveness and robustness.
📝 Abstract
Wearable human activity recognition (HAR) has improved steadily, but most progress still relies on closed-set classification, which limits real-world use. In practice, human activity is open-ended, unscripted, personalized, and often compositional, unfolding as narratives rather than instances of fixed classes. We argue that addressing this gap requires more than scaling datasets or models: it requires a fundamental shift in how wearable HAR is formulated, supervised, and evaluated. This work shows how to model open-ended activity narratives by aligning wearable sensor data with natural-language descriptions in an open-vocabulary setting. Our framework has three core components. First, we introduce a naturalistic data collection and annotation pipeline that combines multi-position wearable sensing with free-form, time-aligned narrative descriptions of ongoing behavior, allowing activity semantics to emerge without a predefined vocabulary. Second, we define a retrieval-based evaluation framework that measures semantic alignment between sensor data and language, enabling principled evaluation without fixed classes while also subsuming closed-set classification as a special case. Third, we present a language-conditioned learning architecture that supports sensor-to-text inference over variable-length sensor streams and heterogeneous sensor placements. Experiments show that models trained with fixed-label objectives degrade sharply under real-world variability, while open-vocabulary sensor-language alignment yields robust and semantically grounded representations. Once this alignment is learned, closed-set activity recognition becomes a simple downstream task. Under cross-participant evaluation, our method achieves 65.3% Macro-F1, compared with 31–34% for strong closed-set HAR baselines. These results establish open-ended narrative modeling as a practical and effective foundation for real-world wearable HAR.
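The claim that retrieval-based evaluation subsumes closed-set classification can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings are hand-written toy vectors standing in for the outputs of a sensor encoder and a text encoder, cosine similarity is an assumed choice of similarity measure, and the names `cosine`, `retrieve`, and `macro_f1` are hypothetical. When the candidate texts are restricted to one description per class, nearest-text retrieval reduces to classification, and Macro-F1 can be computed on the retrieved indices.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(sensor_emb, text_embs):
    """Index of the candidate text most similar to the sensor embedding."""
    scores = [cosine(sensor_emb, t) for t in text_embs]
    return max(range(len(scores)), key=scores.__getitem__)

def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / num_classes

# Assumed toy embeddings: one text embedding per class label
# (e.g. "walking", "cooking", "typing"), so retrieval acts as classification.
label_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Assumed toy sensor-window embeddings with their ground-truth class indices.
sensor_embs = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1], [0.1, 0.2, 0.9], [0.6, 0.7, 0.0]]
y_true = [0, 1, 2, 0]

y_pred = [retrieve(s, label_embs) for s in sensor_embs]
score = macro_f1(y_true, y_pred, num_classes=3)
print(y_pred, round(score, 4))
```

In the open-vocabulary setting the candidate set would instead hold free-text narrative embeddings, and the same `retrieve` step would be scored with a retrieval metric rather than against class indices.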
Problem

Research questions and friction points this paper is trying to address.

wearable HAR
open-ended activity
narrative modeling
open-vocabulary
human activity understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

open-vocabulary activity recognition
sensor-language alignment
narrative-based HAR
retrieval-based evaluation
language-conditioned learning