Reading Between the Lines: Combining Pause Dynamics and Semantic Coherence for Automated Assessment of Thought Disorder

📅 2025-07-17
🤖 AI Summary
Traditional clinical rating scales for assessing formal thought disorder (FTD) in schizophrenia-spectrum disorders are resource-intensive and difficult to scale. Existing automated speech analysis approaches predominantly rely on single-modality features and do not jointly model temporal dynamics (e.g., pause patterns) and semantic coherence. To address this, we propose a multimodal framework that integrates ASR-derived pause dynamics, such as pause frequency and duration distribution, with semantic coherence metrics computed via pretrained language models. The framework is evaluated across three speech contexts: self-recorded diaries (AVH), structured picture descriptions (TOPSY), and dream narratives (PsyCL). Support vector regression (SVR) is used to predict clinical FTD scores. On the TOPSY dataset, integrating the pause and semantic models yields a correlation of ρ = 0.649 for FTD severity prediction and an AUC of 83.71% for detecting severe cases, compared with ρ = 0.584 and AUC = 79.23% for the best semantic-only models. This work offers a scalable, objective, and quantitatively grounded methodology for automated assessment of speech disorganization in psychosis.

📝 Abstract
Formal thought disorder (FTD), a hallmark of schizophrenia spectrum disorders, manifests as incoherent speech and poses challenges for clinical assessment. Traditional clinical rating scales, though validated, are resource-intensive and lack scalability. Automated speech analysis with automatic speech recognition (ASR) allows for objective quantification of linguistic and temporal features of speech, offering scalable alternatives. The use of utterance timestamps in ASR captures pause dynamics, which are thought to reflect the cognitive processes underlying speech production. However, the utility of integrating these ASR-derived features for assessing FTD severity requires further evaluation. This study integrates pause features with semantic coherence metrics across three datasets: naturalistic self-recorded diaries (AVH, n = 140), structured picture descriptions (TOPSY, n = 72), and dream narratives (PsyCL, n = 43). We evaluated pause-related features alongside established coherence measures, using support vector regression (SVR) to predict clinical FTD scores. Key findings demonstrate that pause features alone robustly predict the severity of FTD. Integrating pause features with semantic coherence metrics enhanced predictive performance compared to semantic-only models, with integration of independent models achieving correlations up to ρ = 0.649 and AUC = 83.71% for severe-case detection (TOPSY, with best ρ = 0.584 and AUC = 79.23% for semantic-only models). The performance gains from integrating semantic and pause features held consistently across all contexts, though the nature of pause patterns was dataset-dependent. These findings suggest that frameworks combining temporal and semantic analyses provide a roadmap for refining the assessment of disorganized speech and advance automated speech analysis in psychosis.
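To make the idea of ASR-derived pause dynamics concrete, the following is a minimal sketch of how pause features might be computed from word-level timestamps. It is not the authors' pipeline: the input layout (token, start, end), the 0.2 s `min_pause` threshold, and the specific feature names are all assumptions chosen for illustration.

```python
from statistics import mean, median


def pause_features(words, min_pause=0.2):
    """Summarise silent gaps between consecutive ASR words.

    words: list of (token, start_sec, end_sec), sorted by start time.
    min_pause: assumed threshold (seconds) below which a gap is ignored;
               this value is illustrative, not taken from the paper.
    """
    # Gap between the end of one word and the start of the next.
    gaps = [
        next_start - prev_end
        for (_, _, prev_end), (_, next_start, _) in zip(words, words[1:])
    ]
    pauses = [g for g in gaps if g >= min_pause]
    total_time = words[-1][2] - words[0][1] if words else 0.0

    return {
        "n_pauses": len(pauses),
        "pause_rate_per_min": 60.0 * len(pauses) / total_time if total_time > 0 else 0.0,
        "mean_pause_sec": mean(pauses) if pauses else 0.0,
        "median_pause_sec": median(pauses) if pauses else 0.0,
        "max_pause_sec": max(pauses, default=0.0),
        "pause_time_ratio": sum(pauses) / total_time if total_time > 0 else 0.0,
    }


# Toy transcript with two pauses (1.0 s and 2.0 s); timestamps are made up.
words = [
    ("the", 0.0, 0.25),
    ("boy", 0.25, 0.5),
    ("is", 1.5, 1.75),
    ("um", 1.75, 2.0),
    ("running", 4.0, 5.0),
]
features = pause_features(words)
```

A dictionary like `features` could serve as one row of the pause feature matrix passed to a regressor; which summary statistics are most informative is likely to depend on the dataset, as the abstract notes.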
Problem

Research questions and friction points this paper is trying to address.

Assessing thought disorder severity using automated speech analysis
Integrating pause dynamics with semantic coherence for FTD evaluation
Enhancing predictive performance of FTD detection across diverse datasets
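Predictive performance in this summary is reported as a rank correlation (ρ) and an AUC for severe-case detection. The sketch below shows one way these two metrics can be computed in plain Python. The toy scores and the severity cutoff of 3 are hypothetical, and the paper's exact evaluation protocol is not described here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical clinician ratings and model predictions for six participants.
clinical = [0, 1, 1, 3, 4, 6]
predicted = [0.2, 0.9, 0.4, 2.5, 3.9, 3.1]

SEVERE_CUTOFF = 3  # assumed threshold for "severe"; not taken from the paper
severe = [1 if c >= SEVERE_CUTOFF else 0 for c in clinical]

rho = spearman_rho(clinical, predicted)
auc = roc_auc(severe, predicted)
```

In practice a library routine (for example a statistics package's Spearman and ROC functions) would normally replace these hand-written versions; they are spelled out here only to show what the two numbers measure.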
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combining pause dynamics with semantic coherence metrics
Using support vector regression for FTD prediction
Integrating ASR-derived pause and semantic features
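As a rough illustration of the contributions listed above, the sketch below computes a simple semantic coherence score and combines predictions from two independently trained models. Everything specific here is an assumption: coherence is taken as the mean cosine similarity between consecutive sentence embeddings, the embeddings are toy vectors rather than language model outputs, and the fusion rule is a plain weighted average. The abstract mentions "integration of independent models" but does not specify the rule in this summary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sequential_coherence(embeddings):
    """Mean similarity between each sentence embedding and the next one."""
    sims = [cosine(a, b) for a, b in zip(embeddings, embeddings[1:])]
    return sum(sims) / len(sims) if sims else 0.0


def late_fusion(pause_preds, semantic_preds, weight=0.5):
    """Weighted average of two models' predictions (assumed fusion rule)."""
    return [
        weight * p + (1.0 - weight) * s
        for p, s in zip(pause_preds, semantic_preds)
    ]


# Toy 2-d "sentence embeddings": the third sentence is unrelated to the second.
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
coherence = sequential_coherence(embeddings)

# Hypothetical FTD score predictions from a pause-only and a semantic-only model.
fused = late_fusion([2.0, 4.0], [4.0, 2.0])
```

In a fuller pipeline, `pause_preds` and `semantic_preds` would come from separately fitted regressors such as SVR models, and the fusion weight would be chosen on held-out data rather than fixed at 0.5.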
Feng Chen
Department of Biomedical Informatics and Health Education, University of Washington, Seattle, WA, USA
Weizhe Xu
PhD student, University of Washington
NLP, AI, Medicine
Changye Li
Department of Biomedical Informatics and Health Education, University of Washington, Seattle, WA, USA
Serguei Pakhomov
University of Minnesota
computational linguistics, health informatics, cognitive science, natural language processing, automatic speech recognition
Alex Cohen
Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA
Simran Bhola
Northwell Health, New Hyde Park, NY, USA
Sandy Yin
Northwell Health, New Hyde Park, NY, USA
Sunny X Tang
Northwell Health, New Hyde Park, NY, USA; Feinstein Institutes for Medical Research, Institute of Behavioral Science, Manhasset, NY, USA; Donald and Barbara Zucker School of Medicine, Department of Psychiatry, Hempstead, NY, USA
Michael Mackinley
London Health Sciences Centre, Canada
Lena Palaniyappan
Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, Quebec, Canada; Robarts Research Institute, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
Dror Ben-Zeev
Professor of Psychiatry and Behavioral Sciences, University of Washington
mHealth, eHealth, schizophrenia, serious mental illness, treatment
Trevor Cohen
University of Washington
Distributional Semantics, Computational Linguistics, Biomedical Informatics, Information Retrieval, Literature-based Discovery