Can LLMs Address Mental Health Questions? A Comparison with Human Therapists

📅 2025-09-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Prior research lacks systematic, empirically grounded comparisons between large language models (LLMs) and licensed psychotherapists in authentic clinical query scenarios. Method: This study conducts the first comparative evaluation of ChatGPT, Gemini, and Llama against human therapists on real patient questions, integrating computational text analysis (assessing readability and sentiment polarity) with user surveys (measuring perceived supportiveness, respectfulness, and acceptability). Contribution/Results: LLM responses significantly outperformed human therapists in linguistic clarity, respectfulness, and supportive tone. However, both end users and clinical experts consistently preferred human therapists for emotional depth, therapeutic alliance formation, and privacy assurance. The findings delineate a viable scope for LLMs, namely lightweight, adjunctive mental health support, while highlighting critical limitations concerning relational authenticity, contextual nuance, and ethical safeguards. This work provides empirical grounding for defining appropriate boundaries and designing ethically robust AI-augmented psychological services.

📝 Abstract
Limited access to mental health care has motivated the use of digital tools and conversational agents powered by large language models (LLMs), yet their quality and reception remain unclear. We present a study comparing therapist-written responses to those generated by ChatGPT, Gemini, and Llama for real patient questions. Text analysis showed that LLMs produced longer, more readable, and lexically richer responses with a more positive tone, while therapist responses were more often written in the first person. In a survey with 150 users and 23 licensed therapists, participants rated LLM responses as clearer, more respectful, and more supportive than therapist-written answers. Yet, both groups of participants expressed a stronger preference for human therapist support. These findings highlight the promise and limitations of LLMs in mental health, underscoring the need for designs that balance their communicative strengths with concerns of trust, privacy, and accountability.
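As a concrete illustration of the text-analysis dimensions the abstract reports (length, readability, lexical richness, tone, and first-person usage), here is a minimal Python sketch. The specific metric and library choices (Flesch reading ease via textstat, type-token ratio for lexical richness, VADER compound score for sentiment polarity) are assumptions for illustration only; the paper does not specify which tools it used.

```python
# Illustrative sketch of the kind of per-response text analysis described
# in the abstract. Metric choices here are assumptions, not the paper's
# documented pipeline.
import re
import textstat  # pip install textstat
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer  # pip install vaderSentiment

FIRST_PERSON = {"i", "me", "my", "mine", "myself", "we", "us", "our", "ours"}
_analyzer = SentimentIntensityAnalyzer()

def describe_response(text: str) -> dict:
    tokens = re.findall(r"[a-z']+", text.lower())
    n = max(len(tokens), 1)
    return {
        "length_words": len(tokens),
        # Higher Flesch score = easier to read.
        "readability": textstat.flesch_reading_ease(text),
        # Type-token ratio as a simple lexical-richness proxy.
        "lexical_richness": len(set(tokens)) / n,
        # VADER compound score: -1 (most negative) to +1 (most positive).
        "sentiment": _analyzer.polarity_scores(text)["compound"],
        # Share of tokens that are first-person pronouns.
        "first_person_rate": sum(t in FIRST_PERSON for t in tokens) / n,
    }

# Comparing a therapist-style reply with an LLM-style reply:
print(describe_response("I hear how hard this has been for you. I've seen this before."))
print(describe_response("It is understandable to feel overwhelmed. Consider these supportive steps."))
```

On responses like the two above, such metrics would surface the contrasts the study reports: the more impersonal reply scores higher on positivity and lower on first-person usage, while the therapist-style reply is dominated by first-person framing.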
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLM effectiveness in mental health responses
Comparing AI-generated and human therapist answer quality
Assessing user preference between AI and human support
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLMs to generate mental health responses
Comparing multiple AI models against human therapists
Balancing AI communication strengths with ethical concerns
Synthia Wang
University of Chicago
Yuwei Cheng
University of Chicago
Austin Song
University of Virginia
Sarah Keedy
University of Chicago
Marc Berman
University of Chicago
Nick Feamster
University of Chicago