VS-LLM: Visual-Semantic Depression Assessment based on LLM for Drawing Projection Test

📅 2025-08-07
🤖 AI Summary
This study addresses the limitations of depression assessment with the PPAT (Person Picking an Apple from a Tree) drawing projection test, which is manual, inconsistent across raters, and hard to scale. We propose VS-LLM, a joint visual-semantic analysis framework and the first to enable automated depression assessment using multimodal large language models. VS-LLM integrates computer vision with large language models to perform fine-grained semantic parsing of visual attributes (including color, composition, and spatial layout), yielding interpretable psychological representations. Experimental results show a 17.6% improvement over the psychologist assessment method, along with better inter-rater consistency and cross-subject generalizability. We also release the first publicly available PPAT dataset annotated for depression, together with the full implementation code. This work establishes a new paradigm for objective, scalable psychological evaluation in art therapy.

📝 Abstract
The Drawing Projection Test (DPT) is an essential tool in art therapy, allowing psychologists to assess participants' mental states through their sketches. In particular, sketches on the theme of "a person picking an apple from a tree" (PPAT) can reveal whether participants are in mental states such as depression. Compared with scales, the DPT can enrich psychologists' understanding of an individual's mental state. However, interpreting the PPAT is laborious and depends on the psychologist's experience. To address this issue, we propose an effective identification method to support psychologists in conducting large-scale automatic DPT. Unlike traditional sketch recognition, the DPT focuses more on the overall evaluation of a sketch, such as color usage and space utilization. Moreover, the PPAT imposes a time limit and prohibits verbal reminders, resulting in low drawing accuracy and a lack of detailed depiction. To address these challenges, we make the following contributions: (1) we provide an experimental environment for automated analysis of PPAT sketches for depression assessment; (2) we offer a Visual-Semantic depression assessment method based on LLM (VS-LLM); (3) we present experimental results showing that our method improves by 17.6% over the psychologist assessment method. We anticipate that this work will contribute to research on mental state assessment based on element recognition in PPAT sketches. Our datasets and code are available at https://github.com/wmeiqi/VS-LLM.
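
To make the two sketch-level cues named in the abstract, color usage and space utilization, more concrete, the following is a minimal illustrative sketch. It is not the authors' VS-LLM method or code: the function names, the white-background assumption, and the toy pixel grid are all hypothetical choices made for this example.

```python
from collections import Counter

# Assumption: the canvas background is pure white; real scans would need a tolerance.
WHITE = (255, 255, 255)
RED = (255, 0, 0)
GREEN = (0, 128, 0)


def space_utilization(pixels, background=WHITE):
    """Return the fraction of canvas pixels that differ from the background."""
    total = sum(len(row) for row in pixels)
    if total == 0:
        return 0.0
    drawn = sum(1 for row in pixels for p in row if p != background)
    return drawn / total


def color_usage(pixels, background=WHITE):
    """Count how many pixels use each non-background color."""
    return Counter(p for row in pixels for p in row if p != background)


# Hypothetical 2x4 "sketch": two green pixels (tree) and one red pixel (apple).
sketch = [
    [WHITE, GREEN, GREEN, WHITE],
    [WHITE, RED, WHITE, WHITE],
]

print(space_utilization(sketch))  # 0.375
print(color_usage(sketch))
```

Global statistics of this kind describe the whole drawing rather than individual objects, which is the distinction the abstract draws between DPT evaluation and traditional sketch recognition.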
Problem

Research questions and friction points this paper is trying to address.

Automates depression assessment via PPAT sketch analysis
Enhances DPT interpretation using Visual-Semantic LLM method
Improves accuracy over traditional psychologist evaluations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated PPAT sketch analysis for depression assessment
Visual-Semantic assessment using LLM technology
17.6% improvement over psychologist evaluation
Meiqi Wu
University of Chinese Academy of Sciences
Computer vision
Yaxuan Kang
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Xuchen Li
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Shiyu Hu
Research Fellow, Nanyang Technological University (NTU)
Computer Vision, Data-centric AI, AI for Science
Xiaotang Chen
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Yunfeng Kang
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Weiqiang Wang
School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China
Kaiqi Huang
Institute of Automation, Chinese Academy of Sciences, Beijing, China