🤖 AI Summary
High-quality, privacy-compliant psychotherapy dialogue data are scarce, hindering the fine-tuning and deployment of open-source large language models (LLMs) in mental health counseling. To address this, we propose a multi-agent collaborative generation framework that formalizes therapy dialogues as a cognitive behavioral therapy (CBT)-guided pipeline of specialized subtasks, each executed by dedicated LLM agents. We further design a comprehensive nine-dimensional evaluation framework integrating automated metrics and expert human assessment. The generated dialogues significantly outperform baselines in quality, diversity, and therapeutic alignment, achieving a 77.2% expert preference rate. Fine-tuning LLMs on this data yields substantial improvements: +6.3% in CTRS (Cognitive Therapy Rating Scale) general counseling skills and +7.3% in CBT-specific competencies. This work advances data-efficient, clinically grounded LLM adaptation for psychotherapy support.
📝 Abstract
The growing demand for scalable psychological counseling highlights the need to fine-tune open-source Large Language Models (LLMs) with high-quality, privacy-compliant data, yet such data remain scarce. Here we introduce MAGneT, a novel multi-agent framework for synthetic psychological counseling session generation that decomposes counselor response generation into coordinated sub-tasks handled by specialized LLM agents, each modeling a key psychological technique. Unlike prior single-agent approaches, MAGneT better captures the structure and nuance of real counseling. In addition, we address inconsistencies in prior evaluation protocols by proposing a unified evaluation framework that integrates diverse automatic and expert metrics. Furthermore, we expand expert evaluation from the four aspects of counseling used in prior work to nine, enabling a more thorough and robust assessment of data quality. Empirical results show that MAGneT significantly outperforms existing methods in the quality, diversity, and therapeutic alignment of generated counseling sessions, improving general counseling skills by 3.2% and CBT-specific skills by 4.3% on average on the Cognitive Therapy Rating Scale (CTRS). Crucially, experts prefer MAGneT-generated sessions in 77.2% of cases on average across all aspects. Moreover, an open-source model fine-tuned on MAGneT-generated sessions outperforms models fine-tuned on sessions generated by baseline methods, with average CTRS improvements of 6.3% on general counseling skills and 7.3% on CBT-specific skills. We also make our code and data public.
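The abstract's core idea, decomposing a single counselor turn into coordinated sub-tasks handled by specialized agents, can be illustrated with a minimal sketch. This is not the authors' implementation: the agent names (reflection, questioning, psychoeducation), the stubbed LLM calls, and the simple concatenation-based coordinator are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    # Counseling technique this agent models (illustrative labels).
    name: str
    # Maps a client utterance to this agent's partial contribution;
    # in a real system this would wrap a prompted LLM call.
    respond: Callable[[str], str]

def make_stub(technique: str) -> Callable[[str], str]:
    # Stand-in for an LLM specialized to one psychological technique.
    return lambda utterance: f"[{technique}] response to: {utterance}"

def counselor_turn(agents: List[Agent], utterance: str) -> str:
    # Each specialized agent produces a partial contribution; a
    # coordinator (here: plain concatenation) composes the final turn.
    parts = [agent.respond(utterance) for agent in agents]
    return " ".join(parts)

agents = [
    Agent("reflection", make_stub("reflection")),
    Agent("questioning", make_stub("questioning")),
    Agent("psychoeducation", make_stub("psychoeducation")),
]
turn = counselor_turn(agents, "I feel overwhelmed at work.")
print(turn)
```

The point of the decomposition is that each agent can be prompted and evaluated for one technique in isolation, while the coordinator controls how the techniques combine into a single coherent counselor response.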