🤖 AI Summary
This study addresses the challenge of modeling the multifaceted linguistic impairments in primary progressive aphasia (PPA), which existing speech synthesis approaches struggle to capture comprehensively due to limited clinical data. The authors propose HASS, a clinically driven, hierarchical framework for simulating aphasic speech that, for the first time, integrates the semantic, phonological, and temporal deficits characteristic of the logopenic variant of PPA (lvPPA) within a unified architecture. Leveraging expert-annotated impairment patterns, HASS enables controllable synthesis of speech with adjustable severity levels. Speech samples generated by this framework substantially improve the accuracy and generalization of automatic PPA detection models, demonstrating its effectiveness and clinical fidelity in data-scarce medical settings.
📝 Abstract
Building a diagnosis model for primary progressive aphasia (PPA) has been challenging due to data scarcity. Collecting clinical data at scale is limited by the high vulnerability of the clinical population and the high cost of expert labeling. To circumvent this, previous studies simulate dysfluent speech to generate training data. However, these approaches rely on isolated dysfluencies and are not comprehensive enough to simulate PPA as a holistic, multi-level phenotype. To address this, we propose a novel, clinically grounded simulation framework, Hierarchical Aphasic Speech Simulation (HASS). HASS simulates behaviors of the logopenic variant of PPA (lvPPA) at varying degrees of severity. To this end, semantic, phonological, and temporal deficits of lvPPA are systematically identified by clinical experts and then simulated. We demonstrate that our framework enables more accurate and generalizable detection models.
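To make the hierarchical, severity-controlled idea concrete, here is a minimal text-level sketch in the spirit of the framework described above. The deficit functions, their application order, the probability scalings, and the placeholder lexicon are all illustrative assumptions, not the paper's actual implementation (which operates on speech, not transcripts).

```python
import random

def add_temporal_deficits(words, severity, rng):
    # Insert filled pauses ("uh") to mimic word-finding delays.
    out = []
    for w in words:
        if rng.random() < 0.3 * severity:
            out.append("uh")
        out.append(w)
    return out

def add_phonological_deficits(words, severity, rng):
    # Swap two adjacent characters to crudely mimic phonological errors.
    out = []
    for w in words:
        if len(w) > 3 and rng.random() < 0.2 * severity:
            i = rng.randrange(len(w) - 1)
            w = w[:i] + w[i + 1] + w[i] + w[i + 2:]
        out.append(w)
    return out

def add_semantic_deficits(words, severity, rng, lexicon=("thing", "stuff")):
    # Replace long content words with vague placeholders to mimic
    # semantic word-retrieval failures.
    return [rng.choice(lexicon)
            if len(w) > 5 and rng.random() < 0.15 * severity
            else w for w in words]

def simulate_lvppa(text, severity=1.0, seed=0):
    """Apply the three deficit layers in sequence; severity 0 = unimpaired."""
    rng = random.Random(seed)
    words = text.split()
    words = add_semantic_deficits(words, severity, rng)
    words = add_phonological_deficits(words, severity, rng)
    words = add_temporal_deficits(words, severity, rng)
    return " ".join(words)

print(simulate_lvppa("the quick brown fox jumps over the lazy dog",
                     severity=2.0))
```

Because each layer's perturbation probability scales with `severity`, a severity of zero returns the input unchanged, and higher values compound deficits across all three levels, loosely mirroring the controllable-severity property of the framework.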