Towards Privacy-Preserving Mental Health Support with Large Language Models

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the scarcity of authentic counseling data and the privacy sensitivities that hinder deploying large language models for mental health support. The authors propose MindChat, a framework that generates high-fidelity synthetic dialogues (collected as MindCorpus) through multi-agent role-playing augmented with a dual-loop feedback mechanism: turn-level critique-and-revision and conversation-level strategy optimization. The model is then trained with an integrated privacy-preserving pipeline combining LoRA-based parameter-efficient fine-tuning, federated learning, and differential privacy. Experimental results show that MindChat is competitive with existing baselines in both automatic and human evaluations while substantially reducing vulnerability to membership inference attacks, demonstrating that synthetic data can achieve a favorable trade-off between utility and privacy preservation.
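The turn-level critique-and-revision loop described above can be sketched as a simple control flow. The toy `critique` and `revise` rules below are hypothetical stand-ins for the LLM critic and reviser agents used in MindChat; only the loop structure (critique a turn, revise it, repeat until clean or the budget runs out) mirrors the paper's description.

```python
# Hypothetical sketch of turn-level critique-and-revision. In MindChat these
# roles are played by LLM agents; here they are rule-based stand-ins so the
# control flow can run standalone.

def critique(turn: str) -> list[str]:
    """Toy critic: flag issues a counseling-quality reviewer might raise."""
    issues = []
    if len(turn.split()) < 4:
        issues.append("too terse for an empathetic counselor reply")
    if not turn.endswith(("?", ".")):
        issues.append("incomplete sentence")
    return issues

def revise(turn: str, issues: list[str]) -> str:
    """Toy reviser: patch flagged issues (a real system would re-prompt an LLM)."""
    if "incomplete sentence" in issues:
        turn = turn + "."
    if "too terse for an empathetic counselor reply" in issues:
        turn = "I hear you. " + turn
    return turn

def refine_turn(turn: str, max_rounds: int = 3) -> str:
    """Turn-level loop: critique, revise, repeat until no issues or budget spent."""
    for _ in range(max_rounds):
        issues = critique(turn)
        if not issues:
            break
        turn = revise(turn, issues)
    return turn

print(refine_turn("That sounds hard"))  # → "I hear you. That sounds hard."
```

The session-level loop in the paper sits one layer above this, refining the counselor's strategy across whole conversations rather than individual turns.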

📝 Abstract
Large language models (LLMs) have shown promise for mental health support, yet training such models is constrained by the scarcity and sensitivity of real counseling dialogues. In this article, we present MindChat, a privacy-preserving LLM for mental health support, together with MindCorpus, a synthetic multi-turn counseling dataset constructed via a multi-agent role-playing framework. To synthesize high-quality counseling data, the developed dialogue-construction framework employs a dual closed-loop feedback design to integrate psychological expertise and counseling techniques through role-playing: (i) turn-level critique-and-revision to improve coherence and counseling appropriateness within a session, and (ii) session-level strategy refinement to progressively enrich counselor behaviors across sessions. To mitigate privacy risks under decentralized data ownership, we fine-tune the base model using federated learning with parameter-efficient LoRA adapters and incorporate differentially private optimization to reduce membership and memorization risks. Experiments on synthetic-data quality assessment and counseling capability evaluation show that MindCorpus improves training effectiveness and that MindChat is competitive with existing general and counseling-oriented LLM baselines under both automatic LLM-judge and human evaluation protocols, while exhibiting reduced privacy leakage under membership inference attacks.
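The abstract's privacy pipeline (federated fine-tuning of LoRA adapters with differentially private optimization) can be illustrated with a minimal sketch of DP federated averaging: each client's adapter update is clipped to a maximum norm and the server adds Gaussian noise before averaging. The clip norm, noise multiplier, and flat-vector update representation below are illustrative assumptions, not the paper's actual settings.

```python
# Minimal DP-FedAvg-style sketch, assuming client updates are flat vectors
# (e.g. flattened LoRA adapter deltas). Hyperparameters are illustrative.
import math
import random

def clip_update(update: list[float], clip_norm: float) -> list[float]:
    """Clip a client update to a maximum L2 norm, bounding per-client influence."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / (norm + 1e-12))
    return [x * scale for x in update]

def dp_fedavg(client_updates, clip_norm=1.0, noise_mult=0.5, rng=None):
    """Average clipped client updates, then add Gaussian noise to the mean."""
    rng = rng or random.Random(0)
    n = len(client_updates)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    sigma = noise_mult * clip_norm / n  # noise scaled to the averaged sum
    return [sum(u[i] for u in clipped) / n + rng.gauss(0.0, sigma)
            for i in range(len(clipped[0]))]

agg = dp_fedavg([[0.3, -0.1], [2.0, 2.0], [-0.5, 0.4]])
print(agg)  # noisy average of clipped updates
```

Clipping plus calibrated noise is what limits how much any single client's counseling data can shift the aggregated adapter, which is the mechanism behind the reduced membership inference risk the abstract reports.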
Problem

Research questions and friction points this paper is trying to address.

privacy-preserving
large language model
mental health support
synthetic counseling data
data scarcity
Innovation

Methods, ideas, or system contributions that make the work stand out.

privacy-preserving
synthetic counseling data
multi-agent role-playing
federated learning
differential privacy
Dong Xue
Associate Professor of Automation, East China University of Science and Technology
multi-agent systems, complex networks, distributed control and optimization, opinion dynamics in social networks, power systems
Jicheng Tu
Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai, 200237, P.R. China
Ming Wang
Ph.D. student of Data Mining Group, Northeastern University - Shenyang
Machine Psychology, AI for Mental Health, LLM-based Agents
Xin Yan
Missouri University of S&T, Google
Fangzhou Liu
Harbin Institute of Technology
Control Theory, Complex Networks, System Science
Jie Hu
Shanghai Key Laboratory of Mental Health and Psychological Crisis Intervention, School of Psychology and Cognitive Science, East China Normal University, Shanghai, 200062, P.R. China