Can Large Language Models Imitate Human Speech for Clinical Assessment? LLM-Driven Data Augmentation for Cognitive Score Prediction

📅 2026-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses limited, class-imbalanced data in clinical spoken-language assessment of cognitive decline. The authors propose a semantics-guided large language model (LLM) data augmentation approach: each participant's written response serves as a semantic anchor from which GPT-5 generates oral-style monologues in several styles, and Sentence-BERT embeddings are used to train a Partial Least Squares regression model. A similarity-guided class-balancing strategy prioritizes semantically close synthetic samples during training. Evaluated on Hasegawa Dementia Scale score prediction with a Japanese corpus, the method substantially reduces prediction error for the underrepresented low-score group while maintaining performance on the majority group, and it yields more consistent improvements than random class-balanced selection.

📝 Abstract

Accurate assessment of cognitive decline from spontaneous speech remains challenging due to limited dataset size and class imbalance. In this work, we propose a large language model (LLM)-driven data augmentation framework to improve the prediction of cognitive scores from speech. Experiments are conducted on a Japanese corpus in which each participant provides both a spontaneous oral narrative and a written response to the same clinical prompt. The written responses serve as semantic anchors to generate multiple oral-like monologues in different styles using GPT-5. We then predict Hasegawa Dementia Scale scores, a widely used cognitive screening tool in Japan, using a Partial Least Squares regression model trained on Sentence-BERT speech embeddings. We investigate two augmentation strategies: random class-balanced selection, which yields moderate but unstable improvements, and similarity-guided class-balanced selection. The latter prioritizes semantically close synthetic samples, leading to more consistent improvements and substantially reducing prediction error for minority low-score participants while maintaining performance for the majority group. Overall, our findings demonstrate the potential of semantically guided LLM-driven augmentation as a principled approach for addressing class imbalance and improving data efficiency in clinical speech analysis.
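The similarity-guided class-balanced selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: that similarity is cosine similarity, that each synthetic sample is compared against the real embedding of the participant it was generated for, and that balancing means adding synthetic minority samples until the minority count matches the majority count. The names `cosine_similarity` and `select_synthetic` are hypothetical, and the short vectors stand in for Sentence-BERT embeddings.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_synthetic(
    n_majority: int,
    minority_real: Dict[str, Vector],
    synthetic: Dict[str, List[Vector]],
) -> List[Tuple[str, int]]:
    """Pick synthetic minority samples, most similar first, until the
    minority class is as large as the majority class.

    Assumption: the reference for similarity is the participant's own
    real embedding (the paper may use a different reference).
    Returns (participant_id, synthetic_index) pairs.
    """
    deficit = max(n_majority - len(minority_real), 0)
    scored = []
    for pid, real_vec in minority_real.items():
        for idx, syn_vec in enumerate(synthetic.get(pid, [])):
            scored.append((cosine_similarity(real_vec, syn_vec), pid, idx))
    # Highest similarity first; keep only as many as needed to balance.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(pid, idx) for _, pid, idx in scored[:deficit]]


if __name__ == "__main__":
    # Toy data: one real low-score participant, three synthetic variants.
    minority_real = {"p1": [1.0, 0.0]}
    synthetic = {"p1": [[0.9, 0.1], [0.0, 1.0], [1.0, 0.05]]}
    print(select_synthetic(3, minority_real, synthetic))
    # prints [('p1', 2), ('p1', 0)]
```

In this toy case the majority class has three samples and the minority class has one, so two synthetic samples are added. The orthogonal variant at index 1 is left out because it is the least similar to the real embedding. The random class-balanced baseline mentioned in the abstract would instead draw the same number of samples without ranking them.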

Problem

Research questions and friction points this paper is trying to address.

cognitive decline

spontaneous speech

class imbalance

data augmentation

clinical assessment

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven data augmentation

semantics-guided sampling

cognitive score prediction