🤖 AI Summary
This study addresses the need for early depression screening by proposing a dialogue-based assessment method that does not rely on manual annotation. Methodologically, it introduces a prompt template explicitly aligned with the Beck Depression Inventory-II (BDI-II) clinical criteria to guide large language models (LLMs) in performing structured psychological assessments. Because ground-truth labels were unavailable, the JSON-formatted outputs are checked through cross-model agreement analysis and intra-response consistency validation rather than against conventional annotations, and the prompt design also supports analysis of the conversational cues that influence symptom predictions. The key contribution is a clinically aligned prompting framework grounded in multi-model verification, combining interpretability with reliability in assessment. In the eRisk 2025 Pilot Task, the best submission ranks second on the official leaderboard, achieving DCHR = 0.50, ADODL = 0.89, and ASHR = 0.27.
📝 Abstract
This Working Note summarizes the participation of the DS@GT team in two eRisk 2025 challenges. For the Pilot Task on conversational depression detection with large language models (LLMs), we adopted a prompt-engineering strategy in which diverse LLMs conducted BDI-II-based assessments and produced structured JSON outputs. Because ground-truth labels were unavailable, we evaluated cross-model agreement and internal consistency instead. Our prompt design methodology aligned model outputs with BDI-II criteria and enabled analysis of the conversational cues that influenced symptom predictions. Our best submission, ranked second on the official leaderboard, achieved DCHR = 0.50, ADODL = 0.89, and ASHR = 0.27.
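To make the label-free evaluation concrete, the following is a minimal sketch of how internal consistency and cross-model agreement could be checked on structured JSON outputs. It is not the team's implementation. The JSON field names (`item_scores`, `total`, `severity`) and all function names are hypothetical, and exact-match agreement is only one possible agreement measure. The sketch relies only on standard BDI-II facts: 21 items scored 0 to 3, and the usual severity bands for the total (0-13 minimal, 14-19 mild, 20-28 moderate, 29-63 severe).

```python
import json
from itertools import combinations

BDI_ITEMS = 21        # BDI-II has 21 items
MAX_ITEM_SCORE = 3    # each item is scored 0..3


def severity(total):
    """Map a BDI-II total score to its standard severity band."""
    if total <= 13:
        return "minimal"
    if total <= 19:
        return "mild"
    if total <= 28:
        return "moderate"
    return "severe"


def validate(raw):
    """Parse one model's JSON reply and list internal inconsistencies.

    The schema (item_scores / total / severity) is an assumption made
    for this sketch, not the schema used in the Working Note.
    """
    data = json.loads(raw)
    scores = data.get("item_scores", [])
    problems = []
    if len(scores) != BDI_ITEMS:
        problems.append("expected 21 item scores")
    if any(not isinstance(s, int) or not 0 <= s <= MAX_ITEM_SCORE for s in scores):
        problems.append("item score outside 0..3")
    if data.get("total") != sum(scores):
        problems.append("total does not equal sum of item scores")
    if data.get("severity") != severity(sum(scores)):
        problems.append("severity label does not match total")
    return data, problems


def pairwise_agreement(a, b):
    """Fraction of items on which two models assign the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_agreement(score_lists):
    """Average exact-match agreement over all pairs of models."""
    pairs = list(combinations(score_lists, 2))
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)
```

In use, each model's reply would first pass through `validate`, and only replies with an empty problem list would contribute their `item_scores` to `mean_agreement`. A softer measure (for example, agreement within one point per item) may be more appropriate when models differ mainly in calibration.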