Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice

📅 2025-01-02
🤖 AI Summary
This study examines whether large language models (LLMs) can screen for mental health conditions without task-specific training, focusing on depression detection through automated completion of standardized instruments such as the BDI-II. It proposes an adaptive retrieval-augmented generation (RAG) framework grounded in psychometric practice: for each questionnaire item, the method retrieves the most relevant posts from a user's social media history and prompts an LLM to predict the item score in a zero-shot setting. Because the final assessment is derived from the completed questionnaire, the contribution of each item response to the result can be tracked. The method requires no fine-tuning or labeled training data. On Reddit-based benchmark datasets, it achieves BDI-II prediction performance comparable to or surpassing state-of-the-art models, and the authors present it as a scalable screening tool.

📝 Abstract
In psychological practice, standardized questionnaires serve as essential tools for assessing mental constructs (e.g., attitudes, traits, and emotions) through structured questions (also known as items). With the increasing prevalence of social media platforms where users share personal experiences and emotions, researchers are exploring computational methods to leverage this data for rapid mental health screening. In this study, we propose a novel adaptive Retrieval-Augmented Generation (RAG) approach that completes psychological questionnaires by analyzing social media posts. Our method retrieves the most relevant user posts for each question in a psychological survey and uses Large Language Models (LLMs) to predict questionnaire scores in a zero-shot setting. Our findings are twofold. First, we demonstrate that this approach can effectively predict users' responses to psychological questionnaires, such as the Beck Depression Inventory II (BDI-II), achieving performance comparable to or surpassing state-of-the-art models on Reddit-based benchmark datasets without relying on training data. Second, we show how this methodology can be generalized as a scalable screening tool, as the final assessment is systematically derived by completing standardized questionnaires and tracking how individual item responses contribute to the diagnosis, aligning with established psychometric practices.
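
The per-item retrieve-then-score loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a bag-of-words cosine similarity as a stand-in for the semantic retriever, and a pluggable `score_fn` callable as a stand-in for the zero-shot LLM call. All function names, the example items, and the example posts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a crude stand-in for an embedding model)."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(item_text, posts, k=2):
    """Return up to k posts most similar to one questionnaire item, dropping zero-overlap posts."""
    query = Counter(tokenize(item_text))
    scored = [(cosine(query, Counter(tokenize(p))), p) for p in posts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [post for score, post in scored[:k] if score > 0]


def complete_questionnaire(items, posts, score_fn, k=2):
    """Score every item from its retrieved evidence and keep the evidence for traceability.

    score_fn(item_text, evidence) is a hypothetical hook where a zero-shot LLM
    prompt would go; it must return an integer item score.
    """
    responses = {}
    for item_id, item_text in items.items():
        evidence = retrieve(item_text, posts, k)
        responses[item_id] = {
            "score": score_fn(item_text, evidence),
            "evidence": evidence,
        }
    total = sum(r["score"] for r in responses.values())
    return {"items": responses, "total": total}


def placeholder_score(item_text, evidence):
    """Assumption: a toy scorer (one point per retrieved post, capped at 3), not an LLM."""
    return min(3, len(evidence))


# Hypothetical items and posts, for illustration only.
example_items = {
    "sadness": "I feel sad",
    "sleep": "changes in sleeping pattern",
}
example_posts = [
    "I feel so sad lately",
    "cannot sleep at night, my sleeping is a mess",
    "went hiking today",
]

result = complete_questionnaire(example_items, example_posts, placeholder_score)
for item_id, response in result["items"].items():
    print(item_id, response["score"], response["evidence"])
print("total:", result["total"])
```

Keeping the retrieved posts next to each item score is what makes the total traceable: every point in the sum can be followed back to the text that produced it. In a real system, `retrieve` would use a semantic retriever and `score_fn` would wrap an LLM prompt, but the abstract does not specify either component.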
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Mental Health Screening
Depression Detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive RAG
Psychological Assessment
Social Media Analysis