🤖 AI Summary
This study addresses potential subgroup discrimination, particularly against users with low health literacy and high self-efficacy, in multi-turn LLM-based health coaching, where personalized decision-making may inadvertently widen disparities. To this end, we propose a subgroup-aware offline policy evaluation (OPE) framework. Methodologically, we design a factorized decision head that decouples tool selection from interaction style, build a lightweight simulator over hidden user archetypes, and add an early information-gain reward that speeds up identification of user traits. Rewards are typed, pairing objective tool efficacy with subjective satisfaction, so the framework jointly models multidimensional objectives. Experiments reveal that a uniform heavy-tool policy raises average value on the logs but degrades performance for vulnerable subgroups. In contrast, our approach substantially narrows subgroup performance gaps, reduces trait-identification latency, and improves both goal achievement rate and pass@3 accuracy, pointing to an evaluation-first paradigm for fair, interpretable LLM-driven health interventions.
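To make the factorized decision head and the early information-gain bonus concrete, here is a minimal sketch. It assumes a frozen generator that exposes a per-turn state embedding and an explicit belief over hidden user archetypes; the dimensions, head sizes, and bonus weight `beta` are illustrative assumptions, not the implementation evaluated in the paper.

```python
import math
import torch
import torch.nn as nn

# Illustrative sizes (assumptions): four tools, three interaction styles,
# and a 768-dim state embedding from the frozen generator.
N_TOOLS, N_STYLES, D_STATE = 4, 3, 768

class FactorizedDecisionHead(nn.Module):
    """Two small heads on top of a frozen generator's state embedding:
    one picks the tool, the other picks the interaction style."""

    def __init__(self, d_state: int = D_STATE):
        super().__init__()
        self.tool_head = nn.Linear(d_state, N_TOOLS)    # tool selection
        self.style_head = nn.Linear(d_state, N_STYLES)  # interaction style

    def forward(self, state: torch.Tensor):
        # Factorization: tool and style get independent distributions,
        # so style preferences never force a different tool choice.
        tool_probs = torch.softmax(self.tool_head(state), dim=-1)
        style_probs = torch.softmax(self.style_head(state), dim=-1)
        return tool_probs, style_probs

def info_gain_bonus(belief_before, belief_after, beta: float = 0.1) -> float:
    """Early information-gain bonus: pay out the entropy drop in the
    belief over user archetypes, scaled by an assumed weight beta."""
    def entropy(p):
        return -sum(x * math.log(x) for x in p if x > 0)
    # Clipped at zero so the agent is never paid for becoming less certain.
    return beta * max(0.0, entropy(belief_before) - entropy(belief_after))
```

Clipping the bonus at zero and keeping `beta` small matches the "small early information-gain bonus" framing: it nudges the coach to ask trait-revealing questions early without letting probing dominate the typed reward.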
📝 Abstract
We study a web-deployed, tool-augmented LLM health coach with real users. In a pilot with seven users (280 rated turns), offline policy evaluation (OPE) over factorized decision heads (Tool/Style) shows that a uniform heavy-tool policy raises average value on logs but harms specific subgroups, most notably low-health-literacy/high-self-efficacy users. A lightweight simulator with hidden archetypes further shows that adding a small early information-gain bonus reliably shortens trait identification and improves goal success and pass@3. Together, these early findings indicate an evaluation-first path to personalization: freeze the generator, learn subgroup-aware decision heads on typed rewards (objective tool outcomes and satisfaction), and always report per-archetype metrics to surface subgroup harms that averages obscure.
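To illustrate the "always report per-archetype metrics" recommendation, the sketch below estimates a candidate policy's value separately for each archetype using self-normalized importance sampling, a standard OPE estimator chosen here for illustration; the abstract does not specify which estimator the pilot used. The record fields and the toy numbers are assumptions, not the pilot's data.

```python
from collections import defaultdict

def per_archetype_value(logs):
    """Self-normalized importance sampling (SNIPS), computed per archetype.

    logs: iterable of dicts with assumed keys 'archetype' (subgroup label),
    'p_behavior' (logging policy's probability of the logged action),
    'p_target' (candidate policy's probability of that action), and
    'reward' (the typed reward observed for the turn).
    """
    num = defaultdict(float)  # importance-weighted reward per archetype
    den = defaultdict(float)  # importance-weight mass per archetype
    for rec in logs:
        w = rec["p_target"] / rec["p_behavior"]  # importance weight
        num[rec["archetype"]] += w * rec["reward"]
        den[rec["archetype"]] += w
    # One value estimate per archetype, never just the overall mean,
    # so subgroup harms cannot hide inside a rising average.
    return {a: num[a] / den[a] for a in num if den[a] > 0}

# Toy, purely illustrative records: the overall average can rise while
# one subgroup's estimated value falls, the failure mode noted above.
toy_logs = [
    {"archetype": "lowHL_highSE", "p_behavior": 0.5, "p_target": 0.9, "reward": 0.2},
    {"archetype": "typical",      "p_behavior": 0.5, "p_target": 0.9, "reward": 0.8},
    {"archetype": "typical",      "p_behavior": 0.4, "p_target": 0.1, "reward": 0.4},
]
print(per_archetype_value(toy_logs))
```

Self-normalization keeps the per-subgroup estimates bounded when a few turns carry large importance weights, which matters at the pilot's scale of only 280 rated turns.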