PRMB: Benchmarking Reward Models in Long-Horizon CBT-based Counseling Dialogue

📅 2026-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing reward models struggle to effectively evaluate alignment with therapeutic goals in long-term cognitive behavioral therapy (CBT) dialogues and lack evaluation benchmarks covering multi-stage interventions. To address this gap, this work proposes PRMB—the first reward model evaluation framework tailored for multi-turn CBT counseling—featuring six-session trajectories across 21 negative scenarios and supporting both pairwise and Best-of-N preference assessments. Leveraging multi-session dialogue data, preference learning paradigms, and generative reward modeling, PRMB not only exposes the limited cross-session generalization capabilities of current models but also demonstrates the promise of generative reward architectures. Experiments show that PRMB scores exhibit a significant positive correlation with downstream counseling performance and effectively uncover generalization issues missed by prior benchmarks, offering a reliable evaluation tool for mental health dialogue systems.

📝 Abstract
Large language models (LLMs) hold potential for mental healthcare applications, particularly in cognitive behavioral therapy (CBT)-based counseling, where reward models play a critical role in aligning LLMs with preferred therapeutic behaviors. However, existing reward model evaluations often fail to capture alignment effectiveness in long-horizon interventions, due to limited coverage of process-oriented datasets and misalignment between evaluation targets and psychological alignment objectives. To address these limitations, we present PRMB, a comprehensive benchmark tailored for evaluating reward models in multi-session CBT counseling. PRMB spans 6 sessions and 21 diverse negative scenarios, incorporating both pairwise and Best-of-N preference evaluations. We demonstrate a positive correlation between our benchmark and downstream counseling dialogue performance. Based on our benchmark, we conduct an extensive analysis of state-of-the-art reward models, revealing generalization defects that previous benchmarks did not uncover and highlighting the potential of generative reward models. Furthermore, we examine the effectiveness of inference-time strategies for evaluating reward models and analyze the factors that influence generative reward models. This work advances intelligent informatics for personalized healthcare by establishing a framework for reward model assessment in mental health dialogues. Evaluation code and datasets are publicly available at https://github.com/YouKenChaw/PRMB.
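The two evaluation protocols the abstract names can be sketched briefly. In a pairwise evaluation, a reward model is scored on how often it ranks the human-preferred counselor response above the rejected one; in Best-of-N, it selects one response from N candidates. The sketch below is illustrative only: `toy_reward` is a hypothetical stand-in scorer, not PRMB's reward model or data.

```python
# Minimal sketch of pairwise and Best-of-N preference evaluation,
# as described in the PRMB abstract. `toy_reward` is a hypothetical
# stand-in; a real setup would score responses with a trained reward model.

def toy_reward(response: str) -> float:
    # Stand-in scorer: longer, more elaborated counselor turns score higher.
    return float(len(response.split()))

def pairwise_accuracy(pairs, reward_fn) -> float:
    """Fraction of (chosen, rejected) pairs where the reward model
    scores the preferred response strictly higher."""
    correct = sum(reward_fn(chosen) > reward_fn(rejected)
                  for chosen, rejected in pairs)
    return correct / len(pairs)

def best_of_n(candidates, reward_fn) -> str:
    """Best-of-N selection: return the highest-reward candidate."""
    return max(candidates, key=reward_fn)

# Toy preference pairs (chosen, rejected) for illustration.
pairs = [
    ("Let's examine the evidence for that thought together.", "Just cheer up."),
    ("What went through your mind when that happened?", "That sounds bad."),
]
print(pairwise_accuracy(pairs, toy_reward))  # 1.0 under this toy scorer
print(best_of_n(["Okay.", "Can you tell me more about that feeling?"],
                toy_reward))
```

In PRMB these protocols are applied per session across the six-session trajectories, which is what lets the benchmark surface cross-session generalization failures.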
Problem

Research questions and friction points this paper addresses.

reward models
long-horizon counseling
cognitive behavioral therapy
alignment evaluation
mental healthcare
Innovation

Methods, ideas, or system contributions that make the work stand out.

reward model benchmark
long-horizon CBT counseling
generative reward models
preference evaluation
mental health dialogue