🤖 AI Summary
Speech disruptions in psychosis exhibit high phenotypic variability, which undermines predictive robustness. Method: We propose an uncertainty-aware multimodal speech analysis model that introduces modality-level Bayesian uncertainty quantification to psychosis speech modeling for the first time. The model fuses acoustic and linguistic features and employs a task-adaptive gating mechanism to dynamically weight modality contributions, enabling calibrated predictions across both structured and unstructured spoken tasks. Contributions/Results: (1) a novel, interpretable mechanism for dynamic feature weighting; (2) significantly improved cross-task generalization; (3) on 114 participants, reduced RMSE, an F1-score of 83%, and an expected calibration error (ECE) of 0.045; (4) reliable identification of validated speech markers, including pitch variability and fluency disruptions, supporting early detection and personalized clinical assessment.
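The summary above describes uncertainty-driven fusion with a task-adaptive gate. The paper does not specify the fusion equations, so the following is only a minimal illustrative sketch, assuming per-modality predictive variances from a Bayesian head and a scalar task bias; the function name, signature, and inverse-variance weighting scheme are hypothetical, not the authors' implementation:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_modalities(acoustic_pred, linguistic_pred,
                    acoustic_var, linguistic_var,
                    task_bias=0.0):
    """Hypothetical sketch of uncertainty-aware gated fusion.

    Each modality's prediction is weighted by the inverse of its
    predictive variance (higher uncertainty -> lower weight).
    A task-dependent bias shifts weight toward the acoustic
    modality (task_bias > 0, e.g. structured interviews) or the
    linguistic modality (task_bias < 0, e.g. free narratives).
    """
    logits = np.array([
        -np.log(acoustic_var) + task_bias,
        -np.log(linguistic_var) - task_bias,
    ])
    weights = softmax(logits)                       # gate values, sum to 1
    fused = weights[0] * acoustic_pred + weights[1] * linguistic_pred
    return fused, weights
```

With equal variances and no task bias the gate splits weight evenly; inflating one modality's variance, or biasing toward a task type, shifts the gate accordingly.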
📝 Abstract
Capturing subtle speech disruptions across the psychosis spectrum is challenging because of the inherent variability in speech patterns, which reflects individual differences and the fluctuating nature of symptoms in both clinical and non-clinical populations. Because the speech disruptions characteristic of psychosis appear across the spectrum, including in non-clinical individuals, accounting for uncertainty in speech data is essential for predicting symptom severity and improving diagnostic precision. We develop an uncertainty-aware model that integrates acoustic and linguistic features to predict symptom severity and psychosis-related traits. Quantifying uncertainty at the level of individual modalities allows the model to accommodate speech variability and improve prediction accuracy. We analyzed speech data from 114 participants, including 32 individuals with early psychosis and 82 with low or high schizotypy, collected through structured interviews, semi-structured autobiographical tasks, and narrative-driven interactions in German. The model improved prediction accuracy, reducing RMSE and achieving an F1-score of 83% with an expected calibration error (ECE) of 0.045, and performed robustly across different interaction contexts. Uncertainty estimation also improved interpretability by identifying reliability differences among speech markers such as pitch variability, fluency disruptions, and spectral instability. The model adjusted dynamically to task structure, weighting acoustic features more heavily in structured settings and linguistic features more heavily in unstructured contexts. This approach strengthens early detection, personalized assessment, and clinical decision-making in psychosis-spectrum research.
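The abstract reports calibration via expected calibration error (ECE). For readers unfamiliar with the metric, the standard binned formulation can be sketched as follows; this is the textbook definition, not the paper's code, and the bin count of 10 is an assumption:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: the average absolute gap between accuracy and mean
    confidence within each confidence bin, weighted by the fraction of
    samples falling in that bin. 0.0 means perfectly calibrated."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            accuracy = correct[mask].mean()
            avg_confidence = confidences[mask].mean()
            ece += mask.mean() * abs(accuracy - avg_confidence)
    return ece
```

An ECE of 0.045 therefore means the model's stated confidence deviates from its observed accuracy by about 4.5 percentage points on average.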