Breaking Data Efficiency Dilemma: A Federated and Augmented Learning Framework For Alzheimer's Disease Detection via Speech

📅 2026-02-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the data efficiency problem in Alzheimer's disease (AD) detection from speech, which stems from the scarcity of medical data and strict privacy constraints. The authors propose a framework that combines three components: voice conversion-based augmentation that recombines voice and content across categories, adaptive federated learning, and an attention-driven cross-modal alignment mechanism between acoustic and textual representations. Together these enable privacy-preserving collaboration across institutions while making better use of the limited data available. The authors frame the gains as improvements in absolute data efficiency, collaborative training efficiency, and representation learning efficiency, and report a multimodal accuracy of 91.52% on the ADReSSo dataset, outperforming the centralized baselines they compare against.
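To make the idea of attention-driven alignment between textual and acoustic representations concrete, here is a minimal sketch of scaled dot-product cross-attention in which each word embedding attends over acoustic frames. It is not the paper's model: the function name `cross_attention`, the embedding sizes, the random inputs, and the absence of learned projection matrices are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(text_tokens, audio_frames):
    """Each word embedding (query) attends over acoustic frames (keys and values).

    Hypothetical helper: a real model would apply learned query/key/value
    projections first; they are omitted here to keep the sketch short.
    """
    d = text_tokens.shape[-1]
    scores = text_tokens @ audio_frames.T / np.sqrt(d)  # (n_words, n_frames)
    weights = softmax(scores, axis=-1)                  # each row sums to 1
    attended = weights @ audio_frames                   # (n_words, d)
    return attended, weights


# Random stand-ins for 5 word embeddings and 40 acoustic frames (dimension 16).
rng = np.random.default_rng(0)
text = rng.normal(size=(5, 16))
audio = rng.normal(size=(40, 16))
fused, attn = cross_attention(text, audio)
print(fused.shape, attn.shape)
```

Each row of `attn` is a distribution over acoustic frames for one word, which is the sense in which the alignment is word-level in this sketch.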

📝 Abstract

Early diagnosis of Alzheimer's Disease (AD) is crucial for delaying its progression. AI-based detection from speech is non-invasive and cost-effective, but it faces a critical data efficiency dilemma caused by medical data scarcity and privacy barriers. We therefore propose FAL-AD, a framework that integrates federated learning with data augmentation to systematically optimize data efficiency. Our approach makes three contributions. First, it improves absolute efficiency through voice conversion-based augmentation, which generates diverse pathological speech samples via cross-category voice-content recombination. Second, it improves collaborative efficiency via an adaptive federated learning paradigm that maximizes cross-institutional benefits under privacy constraints. Third, it optimizes representational efficiency with an attentive cross-modal fusion model that achieves fine-grained word-level alignment and acoustic-textual interaction. Evaluated on ADReSSo, FAL-AD achieves a state-of-the-art multi-modal accuracy of 91.52%, outperforming all centralized baselines and demonstrating a practical solution to the data efficiency dilemma. Our source code is publicly available at https://github.com/smileix/fal-ad.
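The abstract does not spell out how the adaptive federated learning paradigm aggregates client updates. As a point of reference only, the sketch below shows plain federated averaging (FedAvg), in which a server combines locally trained parameters weighted by each site's sample count and no raw speech leaves an institution. This is a swapped-in baseline, not the FAL-AD scheme; the function `fedavg`, the two sites, and their parameter values are hypothetical.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of per-client parameter dicts (plain FedAvg).

    Assumption: every client holds the same parameter names and shapes.
    """
    total = float(sum(client_sizes))
    keys = client_weights[0].keys()
    return {
        k: sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in keys
    }


# Two hypothetical institutions with different amounts of local data.
site_a = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
site_b = {"w": np.array([3.0, 4.0]), "b": np.array([1.0])}

# Site A contributes 30 samples and site B contributes 10, so A gets weight 0.75.
global_model = fedavg([site_a, site_b], client_sizes=[30, 10])
print(global_model)
```

An adaptive variant would presumably change how the per-client weights are chosen, but the abstract gives no detail, so the fixed sample-count weighting here should be read as a baseline assumption.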

Problem

Research questions and friction points this paper is trying to address.

Alzheimer's Disease

data efficiency

speech-based detection

medical data scarcity

privacy constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

federated learning

data augmentation

voice conversion

cross-modal fusion

Alzheimer's disease detection
