🤖 AI Summary
Conventional approaches to personalizing hip exoskeleton assistance depend on slow, computationally intensive tuning and manually set parameters. To address this, the paper proposes an online reward-learning framework based on active pairwise preference feedback. The method requires no prior dynamics model or hand-tuned parameters; instead, it learns user-specific torque profiles within minutes from randomly generated torque trajectories and Bayesian preference modeling, and the resulting profiles are assessed for movement synchrony and device negative work. In tests with eight healthy participants, the learned preferences were consistent, natural joint coordination was preserved, and the preferred torques lowered device-induced negative work. This work represents the first application of pairwise preference learning to real-time adaptive control of wearable exoskeletons, offering a route to personalization that is efficient, interpretable, and low-burden for users.
📝 Abstract
Hip exoskeletons are increasing in popularity due to their effectiveness across various scenarios and their ability to adapt to different users. However, personalizing the assistance often requires lengthy tuning procedures and computationally intensive algorithms, and most existing methods do not incorporate user feedback. In this work, we propose a novel approach for rapidly learning users' preferences for hip exoskeleton assistance. We perform pairwise comparisons of distinct, randomly generated assistive profiles and collect participants' preferences through active querying. Users' feedback is integrated into a preference-learning algorithm that updates its belief, learns a user-dependent reward function, and changes the assistive torque profiles accordingly. Results from eight healthy subjects show distinct preferred torque profiles, and users' choices remain consistent when their preferred profile is compared with a perturbed profile. A comprehensive evaluation of users' preferences reveals a close relationship with individual walking strategies. The tested torque profiles do not disrupt kinematic joint synergies, and participants favor assistive torques that are synchronized with their movements, resulting in lower negative power from the device. This straightforward approach enables the rapid learning of users' preferences and rewards, laying the groundwork for future studies on reward-based human-exoskeleton interaction.
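The belief update described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear reward over a hypothetical two-dimensional feature vector per torque profile (for instance, normalized peak torque and peak timing), a Bradley-Terry (logistic) choice model, and a particle approximation of the belief. The names `PreferenceBelief` and `simulate_session` are illustrative, queries are drawn at random rather than actively selected, and the "user" is simulated and noise-free.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


class PreferenceBelief:
    """Particle-based belief over reward weights w (assumed linear reward)."""

    def __init__(self, dim, n_particles=2000, seed=0):
        rng = random.Random(seed)
        # Prior: standard normal samples of the reward weights.
        self.particles = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                          for _ in range(n_particles)]
        self.log_weights = [0.0] * n_particles

    def update(self, preferred, rejected):
        """Bayesian update after the user picks `preferred` over `rejected`."""
        diff = [p - r for p, r in zip(preferred, rejected)]
        for i, w in enumerate(self.particles):
            # Bradley-Terry likelihood: P(preferred) = sigmoid(w . diff)
            self.log_weights[i] += log_sigmoid(dot(w, diff))

    def mean(self):
        """Posterior mean of the reward weights."""
        top = max(self.log_weights)
        weights = [math.exp(lw - top) for lw in self.log_weights]
        total = sum(weights)
        dim = len(self.particles[0])
        return [sum(wt * p[k] for wt, p in zip(weights, self.particles)) / total
                for k in range(dim)]


def simulate_session(n_queries=40, seed=1):
    """Run pairwise queries against a simulated user with hidden weights."""
    rng = random.Random(seed)
    true_w = [1.0, -0.5]  # hypothetical hidden preference of the simulated user
    belief = PreferenceBelief(dim=2)
    for _ in range(n_queries):
        # Two randomly generated profiles, described by hypothetical features.
        a = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        b = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        if dot(true_w, a) >= dot(true_w, b):
            belief.update(a, b)
        else:
            belief.update(b, a)
    return belief.mean(), true_w
```

Calling `simulate_session()` returns the posterior mean of the weights together with the simulated user's hidden weights, so the two directions can be compared. An active-querying variant would replace the random pair generation with a rule that selects the most informative pair under the current belief.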