Reinforcement Learning Driven Generalizable Feature Representation for Cross-User Activity Recognition

📅 2025-08-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address degraded generalization in cross-user human activity recognition (HAR) caused by inter-subject variability in motion patterns, sensor placement, and physiological characteristics, this paper proposes a reinforcement learning–driven domain generalization framework. Methodologically, temporal feature extraction is formulated as a sequential decision-making process; autoregressive tokenization preserves intrinsic temporal structure, and a multi-objective reward function—designed without target-domain labels—jointly optimizes inter-class discriminability and user-invariance. The model integrates Transformer architectures with autoregressive generative mechanisms to learn robust, unsupervised representations. Evaluated on DSADS and PAMAP2 benchmarks, the approach significantly outperforms state-of-the-art methods in cross-user HAR accuracy, achieving high performance without subject-specific calibration. This enhances practicality and scalability in real-world wearable applications.

📝 Abstract
Human Activity Recognition (HAR) using wearable sensors is crucial for healthcare, fitness tracking, and smart environments, yet cross-user variability -- stemming from diverse motion patterns, sensor placements, and physiological traits -- hampers generalization in real-world settings. Conventional supervised learning methods often overfit to user-specific patterns, leading to poor performance on unseen users. Existing domain generalization approaches, while promising, frequently overlook temporal dependencies or depend on impractical domain-specific labels. We propose Temporal-Preserving Reinforcement Learning Domain Generalization (TPRL-DG), a novel framework that redefines feature extraction as a sequential decision-making process driven by reinforcement learning. TPRL-DG leverages a Transformer-based autoregressive generator to produce temporal tokens that capture user-invariant activity dynamics, optimized via a multi-objective reward function balancing class discrimination and cross-user invariance. Key innovations include: (1) an RL-driven approach for domain generalization, (2) autoregressive tokenization to preserve temporal coherence, and (3) a label-free reward design eliminating the need for target user annotations. Evaluations on the DSADS and PAMAP2 datasets show that TPRL-DG surpasses state-of-the-art methods in cross-user generalization, achieving superior accuracy without per-user calibration. By learning robust, user-invariant temporal patterns, TPRL-DG enables scalable HAR systems, facilitating advancements in personalized healthcare, adaptive fitness tracking, and context-aware environments.
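The abstract's key design point is a multi-objective reward that needs no target-user labels: it balances class discrimination (computed from source-user labels) against cross-user invariance of the learned features. A minimal sketch of how such a reward could be scored is below, using a Fisher-style scatter ratio for discrimination and per-user centroid spread for invariance. The function name, the `alpha` weight, and the exact terms are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def tprl_reward(features, labels, users, alpha=0.5):
    """Illustrative multi-objective reward: class discrimination plus
    cross-user invariance. Hypothetical stand-in, not the paper's exact terms.

    features: (N, D) learned representations
    labels:   (N,) activity class per sample (source users only)
    users:    (N,) source-user id per sample
    alpha:    weight trading off discrimination vs. invariance
    """
    classes = np.unique(labels)
    mu = features.mean(axis=0)

    # Discrimination term: between-class vs. within-class scatter (Fisher-style)
    within = sum(((features[labels == c] - features[labels == c].mean(axis=0)) ** 2).sum()
                 for c in classes)
    between = sum((labels == c).sum() *
                  ((features[labels == c].mean(axis=0) - mu) ** 2).sum()
                  for c in classes)
    discrimination = between / (within + 1e-8)

    # Invariance term: penalize spread between per-user feature centroids,
    # so representations that look the same across users score higher
    centroids = np.stack([features[users == u].mean(axis=0) for u in np.unique(users)])
    invariance = -((centroids - centroids.mean(axis=0)) ** 2).sum()

    return alpha * discrimination + (1 - alpha) * invariance
```

Because both terms depend only on source-user data, the reward can drive the RL policy without any annotation from the unseen target user, which is what makes the design calibration-free.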
Problem

Research questions and friction points this paper is trying to address.

Addressing cross-user variability in wearable sensor activity recognition
Overcoming overfitting to user-specific patterns in supervised learning
Eliminating dependency on domain labels for temporal generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning driven domain generalization framework
Transformer-based autoregressive tokenization for temporal coherence
Multi-objective reward function balancing discrimination and invariance
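The second innovation, autoregressive tokenization, can be pictured as a sequential decision loop: at each step a policy emits the next discrete token conditioned on the sensor window and all tokens emitted so far, so later choices depend on earlier ones, which is what preserves temporal coherence. The sketch below is illustrative only; `policy` stands in for the paper's Transformer-based generator, and `toy_policy` is a made-up example.

```python
def autoregressive_tokenize(window, policy, num_tokens=8):
    """Tokenization as a sequential decision process (illustrative sketch).

    `policy` is any callable (window, prefix_tuple) -> token id; in the
    paper's framework this role is played by a Transformer generator
    trained with reinforcement learning.
    """
    tokens = []
    for _ in range(num_tokens):
        # Each decision conditions on the full prefix of earlier tokens
        tokens.append(policy(window, tuple(tokens)))
    return tokens

# Toy deterministic policy: token depends on the step index and a coarse
# statistic of the window (purely for demonstration).
def toy_policy(window, prefix):
    return (len(prefix) + int(sum(window))) % 4
```

For example, `autoregressive_tokenize([1, 2, 3], toy_policy, num_tokens=4)` yields `[2, 3, 0, 1]`: the prefix length feeds back into every subsequent decision.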
Xiaozhou Ye
Department of Electrical, Computer, and Software Engineering, The University of Auckland, Auckland, New Zealand
Kevin I-Kai Wang
Department of Electrical, Computer, and Software Engineering, The University of Auckland
Wireless Sensor Network · Ubiquitous Computing · Pervasive Healthcare · Machine Learning