🤖 AI Summary
This study addresses the challenge of early identification of depression and anxiety among adolescents. We propose a deep learning–based screening method leveraging speech and textual data from teacher–student dialogues. Using 16,000 manually annotated audio samples, we integrate automatic speech recognition (ASR) transcripts with pre-trained language models (e.g., RoBERTa) within a transfer learning framework to build three binary classifiers: depression, anxiety, and comorbid depression–anxiety. Our key finding is evidence suggesting that word-sequence cues may be more salient for depression detection than for anxiety, which points to differences in how the two conditions surface in language. The comorbidity classifier achieves the highest performance (AUC = 0.86), a result we show is not attributable to data skew, while the depression- and anxiety-specific classifiers attain AUCs of 0.82 and 0.79, respectively. This approach supports unobtrusive, large-scale digital behavioral health screening in school settings.
📝 Abstract
Digital screening and monitoring applications can aid providers in the management of behavioral health conditions. We explore deep language models for detecting depression, anxiety, and their comorbidity using input from conversational speech. The speech data comprise 16k spoken interactions, each labeled for both depression and anxiety. Binary classification results range from 0.79 to 0.86 AUC, depending on condition and comorbidity, with the best performance occurring for comorbid cases. We show that this result is not attributable to data skew. Finally, we find evidence suggesting that underlying word sequence cues may be more salient for depression than for anxiety.
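
For readers unfamiliar with the metric, the sketch below shows one common way to compute AUC for a binary classifier: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not the study's data or evaluation code. The last lines also illustrate a general property of AUC: duplicating the negative examples changes the class ratio but leaves the value unchanged. This is not a reproduction of the authors' data-skew analysis.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = condition present, 0 = condition absent.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1, 0.35]
auc = roc_auc(labels, scores)

# Duplicate every negative example to make the classes more imbalanced.
neg_scores = [s for y, s in zip(labels, scores) if y == 0]
skewed_labels = labels + [0] * len(neg_scores)
skewed_scores = scores + neg_scores
auc_skewed = roc_auc(skewed_labels, skewed_scores)

print(f"AUC: {auc:.3f}, AUC with doubled negatives: {auc_skewed:.3f}")
```

Because AUC depends only on how positives rank relative to negatives, it is a natural choice when comparing classifiers trained on conditions with different prevalence, although score distributions can still shift with sampling in practice.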