Can Large Language Models Imitate Human Speech for Clinical Assessment? LLM-Driven Data Augmentation for Cognitive Score Prediction

📅 2026-05-15
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses limited, class-imbalanced data in clinical spoken-language assessment of cognitive decline. The authors propose a semantics-guided large language model (LLM) data augmentation approach: each participant's written response serves as a semantic anchor from which GPT-5 generates multiple oral-style monologues in different styles, and Sentence-BERT embeddings of the speech are fed to a Partial Least Squares regression model. A similarity-guided class-balancing strategy then prioritizes semantically close synthetic samples when selecting augmentation data. On Hasegawa Dementia Scale score prediction, the method substantially reduces prediction error for the underrepresented low-score group while maintaining performance on the majority group, indicating that semantically guided LLM-based augmentation can address class imbalance in clinical speech analysis.
📝 Abstract
Accurate assessment of cognitive decline from spontaneous speech remains challenging due to limited dataset size and class imbalance. In this work, we propose a large language model (LLM)-driven data augmentation framework to improve the prediction of cognitive scores from speech. Experiments are conducted on a Japanese corpus in which each participant provides both a spontaneous oral narrative and a written response to the same clinical prompt. The written responses serve as semantic anchors to generate multiple oral-like monologues in different styles using GPT-5. We then predict Hasegawa Dementia Scale scores, a widely used cognitive screening tool in Japan, using a Partial Least Squares regression model trained on Sentence-BERT speech embeddings. We investigate two augmentation strategies: random class-balanced selection, which yields moderate but unstable improvements, and similarity-guided class-balanced selection. The latter prioritizes semantically close synthetic samples, leading to more consistent improvements and substantially reducing prediction error for minority low-score participants while maintaining performance for the majority group. Overall, our findings demonstrate the potential of semantically guided LLM-driven augmentation as a principled approach for addressing class imbalance and improving data efficiency in clinical speech analysis.
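The similarity-guided class-balanced selection described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say what the similarity is measured against or how many samples are added, so the sketch assumes cosine similarity between each synthetic sample and its anchor embedding, and assumes each class is topped up to the majority-class count. The function names, the "low"/"high" labels, and the toy 2-D vectors are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_balanced(real_labels, candidates):
    """Pick synthetic samples so every class reaches the majority-class count.

    real_labels: class label of each real sample (assumed "low" / "high").
    candidates:  list of (label, synthetic_embedding, anchor_embedding).
    Within each class, candidates closest to their anchor are taken first
    (assumption: similarity is measured against the anchor embedding).
    """
    counts = Counter(real_labels)
    target = max(counts.values())
    chosen = []
    for label, n_real in counts.items():
        deficit = target - n_real  # 0 for the majority class
        pool = [c for c in candidates if c[0] == label]
        pool.sort(key=lambda c: cosine(c[1], c[2]), reverse=True)
        chosen.extend(pool[:deficit])
    return chosen


# Toy 2-D vectors standing in for sentence embeddings (hypothetical data).
real_labels = ["high", "high", "high", "low"]
candidates = [
    ("low", [1.0, 0.0], [1.0, 0.0]),   # identical direction to its anchor
    ("low", [0.0, 1.0], [1.0, 0.0]),   # orthogonal to its anchor
    ("low", [1.0, 1.0], [1.0, 0.0]),   # in between
    ("high", [1.0, 0.0], [1.0, 0.0]),  # majority class, so not needed
]
selected = select_balanced(real_labels, candidates)
```

In this toy case the minority class is short by two samples, so the two "low" candidates closest to their anchors are kept and the orthogonal one is dropped. The random class-balanced baseline mentioned in the abstract would correspond to shuffling `pool` instead of sorting it.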
Problem

Research questions and friction points this paper is trying to address.

cognitive decline
spontaneous speech
class imbalance
data augmentation
clinical assessment
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven data augmentation
semantic-guided sampling
cognitive score prediction
class imbalance
clinical speech analysis