Collaborative Temporal Feature Generation via Critic-Free Reinforcement Learning for Cross-User Sensor-Based Activity Recognition

📅 2026-03-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses cross-user generalization in human activity recognition with wearable sensors by proposing a collaborative temporal feature generation framework. The approach models generalizable feature extraction as an autoregressive generation process and optimizes it with Group-Relative Policy Optimization, a critic-free algorithm that replaces value function estimation with intra-group normalization to eliminate distribution-dependent bias. A three-objective reward (class discriminability, user invariance, and temporal fidelity) guides a Transformer-based generator toward robust feature sequences. On the DSADS and PAMAP2 datasets, the method achieves cross-user accuracies of 88.53% and 75.22%, respectively, while substantially reducing inter-task training variance, accelerating convergence, and generalizing robustly across action spaces of varying dimensionality.

📝 Abstract
Human Activity Recognition using wearable inertial sensors is foundational to healthcare monitoring, fitness analytics, and context-aware computing, yet its deployment is hindered by cross-user variability arising from heterogeneous physiological traits, motor habits, and sensor placements. Existing domain generalization approaches either neglect temporal dependencies in sensor streams or depend on impractical target-domain annotations. We propose a different paradigm: modeling generalizable feature extraction as a collaborative sequential generation process governed by reinforcement learning. Our framework, CTFG (Collaborative Temporal Feature Generation), employs a Transformer-based autoregressive generator that incrementally constructs feature token sequences, each conditioned on prior context and the encoded sensor input. The generator is optimized via Group-Relative Policy Optimization, a critic-free algorithm that evaluates each generated sequence against a cohort of alternatives sampled from the same input, deriving advantages through intra-group normalization rather than learned value estimation. This design eliminates the distribution-dependent bias inherent in critic-based methods and provides self-calibrating optimization signals that remain stable across heterogeneous user distributions. A tri-objective reward comprising class discrimination, cross-user invariance, and temporal fidelity jointly shapes the feature space to separate activities, align user distributions, and preserve fine-grained temporal content. Evaluations on the DSADS and PAMAP2 benchmarks demonstrate state-of-the-art cross-user accuracy (88.53% and 75.22%), substantial reduction in inter-task training variance, accelerated convergence, and robust generalization under varying action-space dimensionalities.
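The intra-group normalization described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used group-relative form, in which each sampled sequence's reward is standardized against the mean and standard deviation of its own group; the paper's exact formulation may differ. The function name, the `eps` stabilizer, and the reward values are hypothetical and chosen only for illustration.

```python
import math


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sequences sampled from the same input.

    Assumption: advantage_i = (r_i - mean(group)) / (std(group) + eps).
    No learned value function (critic) is involved.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mean = sum(rewards) / len(rewards)
    # Population variance over the group.
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    # eps avoids division by zero when every sample gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical scalar rewards for four feature sequences generated from one sensor window.
group_rewards = [0.2, 0.5, 0.8, 0.5]
advantages = group_relative_advantages(group_rewards)
```

Because each advantage is measured relative to the group's own statistics, a sequence is reinforced only when it scores better than its siblings for the same input, which is one way to read the abstract's "self-calibrating" signal.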
Problem

Research questions and friction points this paper is trying to address.

Human Activity Recognition
Cross-User Variability
Domain Generalization
Temporal Dependencies
Wearable Sensors
Innovation

Methods, ideas, or system contributions that make the work stand out.

critic-free reinforcement learning
collaborative feature generation
cross-user generalization
temporal feature modeling
Group-Relative Policy Optimization
Xiaozhou Ye
School of Artificial Intelligence (School of Future Technology), Nanjing University of Information Science and Technology, Nanjing, China
Feng Jiang
Shenzhen University of Advanced Technology
Discourse Parsing, Large-scale Language Model, Dialogue System
Zihan Wang
School of Cyberspace Security, Nanjing University of Information Science and Technology, Nanjing, China
Xiulai Wang
Jinling Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China; School of Artificial Intelligence (School of Future Technology), Nanjing University of Information Science and Technology, Nanjing, China
Yutao Zhang
Moonshot AI
Kevin I-Kai Wang
Department of Electrical, Computer, and Software Engineering, The University of Auckland
Wireless Sensor Network, Ubiquitous Computing, Pervasive Healthcare, Machine Learning