Diffusion Policies with Offline and Inverse Reinforcement Learning for Promoting Physical Activity in Older Adults Using Wearable Sensors

📅 2025-09-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Clinical deployment of activity interventions for older adults at high fall risk is hindered by the difficulty of defining clinically meaningful reward functions and of aligning behavioral policies with real-world therapeutic goals. Method: We propose a framework that integrates inverse reinforcement learning (IRL) with offline reinforcement learning (offline RL). Specifically, we employ a Kolmogorov–Arnold (KA) network to flexibly model individualized, wearable-sensor-driven reward functions, and embed a diffusion-based policy within an Actor–Critic architecture to enable generative, robust policy optimization. Contribution/Results: Evaluated on the PEER clinical trial dataset, our method significantly improves activity-promotion efficacy over strong baselines, and it also achieves state-of-the-art performance on the D4RL benchmark. To our knowledge, this is the first work to introduce both KA networks and diffusion policies into healthcare-oriented offline RL, establishing an interpretable, deployable paradigm for health interventions driven by real-world data.
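As a rough illustration of the KA reward-modeling idea described above, the sketch below builds a tiny Kolmogorov–Arnold-style network in which each edge applies its own learnable univariate function and each output sums the results. A plain polynomial basis stands in for the splines of full KAN implementations, and the three wearable-derived input features are hypothetical; this is a minimal sketch, not the paper's implementation.

```python
import numpy as np

class KANLayer:
    """One Kolmogorov-Arnold-style layer: each edge (i -> j) applies a
    learnable univariate function phi_ij to input x_i, and output j sums
    those contributions. phi_ij is a linear combination of fixed basis
    functions (a simple polynomial basis here, splines in full KANs)."""

    def __init__(self, in_dim, out_dim, n_basis=5, seed=0):
        rng = np.random.default_rng(seed)
        # coeffs[i, j, k]: weight of basis function k on edge (i -> j)
        self.coeffs = rng.normal(0.0, 0.1, size=(in_dim, out_dim, n_basis))
        self.n_basis = n_basis

    def basis(self, x):
        # x: (batch, in_dim) -> (batch, in_dim, n_basis) polynomial features
        return np.stack([x ** k for k in range(self.n_basis)], axis=-1)

    def __call__(self, x):
        b = self.basis(x)                      # (batch, in_dim, n_basis)
        # sum over inputs i and basis index k for each output j
        return np.einsum('bik,ijk->bj', b, self.coeffs)

# Toy "reward model": maps hypothetical daily wearable features
# (e.g., normalized step count, active minutes, sedentary bouts)
# to a scalar reward. Untrained; shapes and mechanics only.
reward_net = [KANLayer(3, 8, seed=0), KANLayer(8, 1, seed=1)]

def reward(features):
    h = features
    for layer in reward_net:
        h = np.tanh(layer(h))   # squashing keeps the toy numerically tame
    return h[:, 0]

print(reward(np.array([[0.5, 0.2, -0.1]])).shape)  # (1,)
```

In the paper's setting the analogous network would be fit so that trajectories of low-fall-risk "expert" participants score highly (the IRL step); the sketch above only shows the function class.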

📝 Abstract
Utilizing offline reinforcement learning (RL) with real-world clinical data is attracting increasing attention in AI for healthcare. However, implementation poses significant challenges: defining direct rewards is difficult, and inverse RL (IRL) struggles to infer accurate reward functions from expert behavior in complex environments. Offline RL also encounters challenges in aligning learned policies with observed human behavior in healthcare applications. To address these challenges in applying offline RL to physical-activity promotion for older adults at high risk of falls, based on wearable-sensor activity monitoring, we introduce Kolmogorov-Arnold Networks and Diffusion Policies for Offline Inverse Reinforcement Learning (KANDI). By leveraging the flexible function approximation of Kolmogorov-Arnold networks, we estimate reward functions by learning free-living behavior from low-fall-risk older adults (experts), while diffusion-based policies within an Actor-Critic framework provide a generative approach to action refinement and efficiency in offline RL. We evaluate KANDI on wearable activity-monitoring data from a two-arm clinical trial in our Physio-feedback Exercise Program (PEER) study, emphasizing its practical application in a fall-risk intervention program to promote physical activity among older adults. KANDI also outperforms state-of-the-art methods on the D4RL benchmark. These results underscore KANDI's potential to address key challenges in offline RL for healthcare, offering an effective solution for activity-promotion intervention strategies.
Problem

Research questions and friction points this paper is trying to address.

Defining direct rewards is difficult in offline RL for healthcare applications
Inverse RL struggles to infer accurate reward functions from expert behavior
Aligning learned policies with observed human behavior in healthcare is challenging
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Kolmogorov-Arnold Networks for reward estimation
Employs diffusion policies for generative action refinement
Combines offline and inverse reinforcement learning for healthcare
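The diffusion-policy bullet above can be illustrated with a minimal reverse-denoising sketch: actions are generated by starting from Gaussian noise and iteratively denoising, conditioned on the state. Everything here is an assumption for illustration (the stand-in denoiser is an untrained random linear map, the beta schedule is a basic DDPM-style linear schedule, and dimensions are arbitrary); the actual method trains the denoiser and couples it with a critic for action selection.

```python
import numpy as np

def diffusion_policy_sample(state, denoiser, act_dim, n_steps=20, seed=0):
    """Toy reverse-diffusion action sampler: draw a_T ~ N(0, I), then
    iteratively denoise it conditioned on the state, following the
    standard DDPM update. Shows the sampling mechanics only."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.1, n_steps)    # linear beta schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    a = rng.standard_normal(act_dim)           # a_T ~ N(0, I)
    for t in reversed(range(n_steps)):
        eps_hat = denoiser(state, a, t)        # predicted noise at step t
        # DDPM posterior mean for a_{t-1}
        a = (a - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:                              # no noise on the final step
            a = a + np.sqrt(betas[t]) * rng.standard_normal(act_dim)
    return np.clip(a, -1.0, 1.0)               # keep actions bounded

# Stand-in denoiser (untrained): fixed random linear map of [state, action, t].
rng = np.random.default_rng(1)
W = rng.normal(0.0, 0.1, size=(2, 4 + 2 + 1))  # state_dim=4, act_dim=2

def denoiser(state, action, t):
    z = np.concatenate([state, action, [t / 20.0]])
    return W @ z

action = diffusion_policy_sample(np.zeros(4), denoiser, act_dim=2)
print(action.shape)  # (2,)
```

In an Actor-Critic setup like the one the bullets describe, a learned Q-function would then score sampled candidate actions so that denoising is steered toward high-value, in-distribution behavior; that training loop is omitted here.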