🤖 AI Summary
This study addresses data inefficiency in speech-based Alzheimer's disease (AD) detection, which stems from scarce medical data and strict privacy constraints. The authors propose FAL-AD, a framework that combines voice conversion-based augmentation (recombining voice and content across categories), adaptive federated learning, and an attentive cross-modal fusion model that aligns acoustic and textual representations at the word level. The framework enables privacy-preserving collaboration across institutions and targets three kinds of efficiency: absolute data efficiency, collaborative training efficiency, and representation learning efficiency. On the ADReSSo dataset, it attains a multimodal accuracy of 91.52%, outperforming all centralized baselines.
📝 Abstract
Early diagnosis of Alzheimer's Disease (AD) is crucial for delaying its progression. While AI-based speech detection is non-invasive and cost-effective, it faces a critical data efficiency dilemma due to medical data scarcity and privacy barriers. To address this, we propose FAL-AD, a novel framework that integrates federated learning with data augmentation to systematically optimize data efficiency. Our approach makes three key contributions. First, it improves absolute efficiency through voice conversion-based augmentation, which generates diverse pathological speech samples via cross-category voice-content recombination. Second, it improves collaborative efficiency via an adaptive federated learning paradigm that maximizes cross-institutional benefits under privacy constraints. Third, it optimizes representational efficiency with an attentive cross-modal fusion model that achieves fine-grained word-level alignment and acoustic-textual interaction. Evaluated on ADReSSo, FAL-AD achieves a state-of-the-art multi-modal accuracy of 91.52%, outperforming all centralized baselines and demonstrating a practical solution to the data efficiency dilemma. Our source code is publicly available at https://github.com/smileix/fal-ad.
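To make the federated component more concrete, the following is a minimal sketch of plain federated averaging (FedAvg), in which a server combines model parameters from several institutions weighted by their local sample counts, so that raw recordings never leave each site. This is not the adaptive scheme proposed in FAL-AD, whose details the abstract does not give; the function name `fedavg`, the parameter dictionaries, and the site sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Average per-client parameter dicts, weighted by local sample count.

    Generic FedAvg for illustration only; the adaptive aggregation used
    by FAL-AD is not described in the abstract and is not reproduced here.
    """
    total = float(sum(client_sizes))
    keys = client_weights[0].keys()
    return {
        k: sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in keys
    }


# Two hypothetical institutions with toy parameters (assumed values).
site_a = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
site_b = {"w": np.array([3.0, 4.0]), "b": np.array([1.0])}

# Site A holds 30 local samples and site B holds 10, so A gets weight 0.75.
global_params = fedavg([site_a, site_b], client_sizes=[30, 10])
print(global_params)
```

In a real round, each site would first train locally on its own speech data and send only the updated parameters to the server, which would then broadcast the averaged result back for the next round.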