Mental Disorders Detection in the Era of Large Language Models

📅 2024-10-09

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

career value

174K/year

🤖 AI Summary

Accurately identifying depression and anxiety from heterogeneous, real-world textual data—characterized by noise, brevity, limited samples, and genre heterogeneity—remains challenging for conventional NLP models. Method: This study conducts a systematic comparative evaluation of traditional machine learning, psycholinguistically informed lightweight encoder models (e.g., BERT variants), and large language models (LLMs) across five diverse clinical and unstructured text datasets. Contribution/Results: LLMs achieve substantial gains (8–15% absolute accuracy improvement) in low-resource, noisy, and genre-mixed settings, demonstrating robustness under realistic constraints. Conversely, on high-quality, clinically validated texts, lightweight encoders leveraging psycholinguistic features attain 92.3% F1—nearly matching the best LLM (93.1%)—while offering superior computational efficiency and model interpretability. This work is the first to empirically delineate the contextual boundaries of LLM superiority in mental health text classification and to validate the competitive performance of interpretable, feature-driven models on critical clinical subsets.

Technology Category

Application Category

📝 Abstract

This paper compares the effectiveness of traditional machine learning methods, encoder-based models, and large language models (LLMs) on the task of detecting depression and anxiety. Five datasets were considered, each differing in format and the method used to define the target pathology class. We tested AutoML models based on linguistic features, several variations of encoder-based Transformers such as BERT, and state-of-the-art LLMs as pathology classification models. The results demonstrated that LLMs outperform traditional methods, particularly on noisy and small datasets where training examples vary significantly in text length and genre. However, psycholinguistic features and encoder-based models can achieve performance comparable to language models when trained on texts from individuals with clinically confirmed depression, highlighting their potential effectiveness in targeted clinical applications.

Problem

Research questions and friction points this paper is trying to address.

Comparing ML methods for detecting depression and anxiety

Evaluating model performance on Russian-language clinical datasets

Assessing LLMs versus traditional approaches on noisy data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large language models outperform traditional machine learning methods

LLMs excel on noisy and small datasets with varying text

Psycholinguistic features match LLMs on clinically confirmed depression texts

🔎 Similar Papers

Aligning Large Language Models for Enhancing Psychiatric Interviews Through Symptom Delineation and Summarization: Pilot Study