Roleplaying with Structure: Synthetic Therapist-Client Conversation Generation from Questionnaires

📅 2025-10-29
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Privacy regulations and scarcity of authentic therapeutic dialogue data hinder clinical validity in mental health AI research. To address this, we propose SQPsych: a framework that constructs client profiles from structured psychological questionnaires and integrates cognitive behavioral therapy (CBT) principles to guide open-weight large language models (LLMs) in generating role-play-based synthetic dialogues; it supports local deployment to mitigate data security risks associated with closed-source models. The resulting SQPsychConv corpus undergoes joint evaluation by clinical experts and LLMs, demonstrating high clinical authenticity. Fine-tuning an LLM on SQPsychConv yields SQPsychLLM, which significantly outperforms baseline models in counseling competency and achieves consistent performance gains across diverse mental health support tasks. This work establishes the first synthetic dialogue paradigm that simultaneously ensures privacy compliance, clinical alignment, and scalability.

📝 Abstract
The development of AI for mental health is hindered by a lack of authentic therapy dialogues, due to strict privacy regulations and the fact that clinical sessions have historically been recorded only rarely. We present an LLM-driven pipeline that generates synthetic counseling dialogues based on structured client profiles and psychological questionnaires. Grounded in the principles of Cognitive Behavioral Therapy (CBT), our method creates synthetic therapeutic conversations for clinical disorders such as anxiety and depression. Our framework, SQPsych (Structured Questionnaire-based Psychotherapy), converts structured psychological input into natural language dialogues through therapist-client simulations. Because data governance policies and privacy restrictions prohibit the transmission of clinical questionnaire data to third-party services, previous methodologies relying on proprietary models are infeasible in our setting. We address this limitation by generating a high-quality corpus, SQPsychConv, using open-weight LLMs, validated through human expert evaluation and LLM-based assessments. Our SQPsychLLM models, fine-tuned on SQPsychConv, achieve strong performance on counseling benchmarks, surpassing baselines in key therapeutic skills. Our findings highlight the potential of synthetic data to enable scalable, data-secure, and clinically informed AI for mental health support. We will release our code, models, and corpus at https://ai-mh.github.io/SQPsych
Problem

Research questions and friction points this paper is trying to address.

Generating synthetic therapy dialogues using structured questionnaires and client profiles
Overcoming data privacy restrictions with open-weight LLMs for mental health AI
Creating clinically grounded CBT conversations for anxiety and depression treatment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates synthetic therapy dialogues using structured profiles
Employs open-weight LLMs for privacy-compliant data generation
Fine-tunes models on CBT-grounded therapist-client simulations
Doan Nam Long Vu
Technical University of Darmstadt
Rui Tan
Professor, Nanyang Technological University
Sensing, security, Internet of Things, cyber-physical systems
Lena Moench
Justus Liebig University Giessen
Svenja Jule Francke
Philipps-University Marburg
Daniel Woiwod
Philipps-University Marburg
Florian Thomas-Odenthal
Philipps-University Marburg
Sanna Stroth
Philipps-University Marburg
Tilo Kircher
Philipps-University Marburg
Christiane Hermann
Justus Liebig University Giessen
Udo Dannlowski
University of Münster
Hamidreza Jamalabadi
Philipps-University Marburg
Shaoxiong Ji
Technical University of Darmstadt
Machine Learning, Natural Language Processing, Health Informatics