FedCVD++: Communication-Efficient Federated Learning for Cardiovascular Risk Prediction with Parametric and Non-Parametric Model Optimization

📅 2025-07-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses three challenges in applying federated learning (FL) to cardiovascular disease risk prediction: stringent privacy requirements, high communication overhead, and severe inter-institutional class imbalance. It integrates non-parametric models, specifically Random Forest and XGBoost, into a medical FL framework alongside parametric ones (logistic regression, SVM, neural networks). Three core innovations are proposed: (1) tree-subset sampling, which cuts Random Forest communication overhead by 70%; (2) lightweight, XGBoost-based feature extraction enabling cross-institutional knowledge transfer; and (3) a synchronized federated SMOTE mechanism to mitigate class imbalance across institutions. Evaluated on the Framingham Heart Study dataset (4,238 records), federated XGBoost achieves an F1-score of 0.80, surpassing its centralized counterpart (0.78), while federated Random Forest attains 0.81, matching non-federated performance. The communication-efficient strategies reduce bandwidth consumption by 3.2× while preserving 95% accuracy, and F1-scores are up to 15% higher than those of existing FL frameworks. The authors present this as the first practical integration of non-parametric models into federated healthcare systems.

📝 Abstract
Cardiovascular diseases (CVD) cause over 17 million deaths annually worldwide, highlighting the urgent need for privacy-preserving predictive systems. We introduce FedCVD++, an enhanced federated learning (FL) framework that integrates both parametric models (logistic regression, SVM, neural networks) and non-parametric models (Random Forest, XGBoost) for coronary heart disease risk prediction. To address key FL challenges, we propose: (1) tree-subset sampling that reduces Random Forest communication overhead by 70%, (2) XGBoost-based feature extraction enabling lightweight federated ensembles, and (3) federated SMOTE synchronization for resolving cross-institutional class imbalance. Evaluated on the Framingham dataset (4,238 records), FedCVD++ achieves state-of-the-art results: federated XGBoost (F1 = 0.80) surpasses its centralized counterpart (F1 = 0.78), and federated Random Forest (F1 = 0.81) matches non-federated performance. Additionally, our communication-efficient strategies reduce bandwidth consumption by 3.2× while preserving 95% accuracy. Compared to existing FL frameworks, FedCVD++ delivers up to 15% higher F1-scores and superior scalability for multi-institutional deployment. This work represents the first practical integration of non-parametric models into federated healthcare systems, providing a privacy-preserving solution validated under real-world clinical constraints.
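
The tree-subset sampling idea can be illustrated with a minimal sketch. It assumes uniform random sampling of trees and a server that simply concatenates the received subsets; the selection rule actually used by FedCVD++ is not described on this page. The `Stump` type, the function names, and the 0.3 sampling fraction (chosen only to mirror the reported 70% reduction) are all hypothetical.

```python
import random
from typing import List, Tuple

# Hypothetical stand-in for a trained tree: a decision stump encoded as
# (feature_index, threshold, label_if_not_above, label_if_above).
Stump = Tuple[int, float, int, int]


def sample_tree_subset(forest: List[Stump], fraction: float,
                       rng: random.Random) -> List[Stump]:
    """Client side: pick a uniform random subset of local trees to transmit.

    Uniform sampling is an assumption made for illustration.
    """
    k = max(1, round(len(forest) * fraction))
    return rng.sample(forest, k)


def merge_subsets(subsets: List[List[Stump]]) -> List[Stump]:
    """Server side: concatenate the received subsets into one global forest."""
    return [tree for subset in subsets for tree in subset]


def predict(forest: List[Stump], x: List[float]) -> int:
    """Majority vote over all stumps; ties go to class 1."""
    votes = sum(hi if x[f] > t else lo for f, t, lo, hi in forest)
    return int(2 * votes >= len(forest))


rng = random.Random(0)
# Three clients, each holding a local forest of 100 toy stumps.
client_forests = [
    [(rng.randrange(2), rng.random(), 0, 1) for _ in range(100)]
    for _ in range(3)
]
subsets = [sample_tree_subset(f, 0.3, rng) for f in client_forests]
global_forest = merge_subsets(subsets)

sent = len(global_forest)
total = sum(len(f) for f in client_forests)
reduction = 1 - sent / total
print(f"trees sent: {sent}/{total}, reduction: {reduction:.0%}")
# prints: trees sent: 90/300, reduction: 70%
```

Counting trees only approximates bandwidth: the saving in bytes equals the saving in trees only when trees are of similar size.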

Problem

Research questions and friction points this paper is trying to address.

Enhancing federated learning for CVD risk prediction with both parametric and non-parametric models
Reducing communication overhead in federated Random Forest by 70%
Addressing class imbalance in cross-institutional federated learning
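
To make the class-imbalance point concrete, the sketch below applies standard SMOTE interpolation locally at each client, with every client oversampling toward one shared target ratio. Treating "synchronization" as a server-broadcast target ratio is an assumption made for illustration, not a description of the FedCVD++ protocol; `balance_client`, `target_ratio`, and the hospital names are hypothetical.

```python
import math
import random
from typing import List

Point = List[float]


def smote(minority: List[Point], n_new: int, k: int,
          rng: random.Random) -> List[Point]:
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, nb)])
    return synthetic


def balance_client(majority: List[Point], minority: List[Point],
                   target_ratio: float, rng: random.Random,
                   k: int = 2) -> List[Point]:
    """Oversample locally until minority/majority reaches target_ratio."""
    n_needed = max(0, math.ceil(target_ratio * len(majority)) - len(minority))
    return minority + smote(minority, n_needed, k, rng)


rng = random.Random(0)
target_ratio = 1.0  # hypothetical value shared by the server with all clients
clients = {
    "hospital_a": {"majority": [[float(i), 0.0] for i in range(8)],
                   "minority": [[1.0, 5.0], [2.0, 6.0]]},
    "hospital_b": {"majority": [[float(i), 1.0] for i in range(6)],
                   "minority": [[0.0, 4.0], [1.0, 5.0], [3.0, 7.0]]},
}
balanced = {
    name: balance_client(d["majority"], d["minority"], target_ratio, rng)
    for name, d in clients.items()
}
for name, points in balanced.items():
    print(name, len(clients[name]["majority"]), len(points))
```

Only the scalar target ratio crosses institutional boundaries in this sketch; raw records and synthetic samples stay on each client.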

Innovation

Methods, ideas, or system contributions that make the work stand out.

Tree-subset sampling reduces communication overhead
XGBoost-based feature extraction enables lightweight ensembles
Federated SMOTE synchronization resolves class imbalance
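
One common way to use boosted trees as a feature extractor is leaf-index encoding: each sample is described by the leaf it reaches in every tree, and a lightweight model is trained on that encoding. Whether FedCVD++ uses this exact scheme is not stated on this page, so the sketch below is an assumption. The hand-written toy trees stand in for trained XGBoost trees, and the feature meanings (age, systolic blood pressure) and thresholds are hypothetical.

```python
from typing import List

# Toy trees: internal nodes are dicts, leaves are integer leaf ids.
# Feature 0 is a hypothetical "age", feature 1 a hypothetical "systolic BP".
tree_a = {
    "feature": 0, "threshold": 50.0,
    "left": 0,
    "right": {"feature": 1, "threshold": 140.0, "left": 1, "right": 2},
}
tree_b = {"feature": 1, "threshold": 120.0, "left": 0, "right": 1}


def leaf_index(tree, x: List[float]) -> int:
    """Walk one tree from the root and return the id of the leaf reached."""
    node = tree
    while not isinstance(node, int):
        go_left = x[node["feature"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node


def leaf_features(trees, n_leaves: List[int], x: List[float]) -> List[int]:
    """One-hot encode the leaf reached in each tree and concatenate."""
    features: List[int] = []
    for tree, n in zip(trees, n_leaves):
        one_hot = [0] * n
        one_hot[leaf_index(tree, x)] = 1
        features.extend(one_hot)
    return features


trees = [tree_a, tree_b]
n_leaves = [3, 2]

high_risk = leaf_features(trees, n_leaves, [62.0, 150.0])
low_risk = leaf_features(trees, n_leaves, [40.0, 110.0])
print(high_risk)  # [0, 0, 1, 0, 1]
print(low_risk)   # [1, 0, 0, 1, 0]
```

With the real library, the leaf indices could be read from a trained model through `Booster.predict(..., pred_leaf=True)` instead of the toy traversal shown here.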
Abdelrhman Gaber
CSE Department, The American University in Cairo, Street, New Cairo, 11835, Cairo, Egypt.
Hassan Abd-Eltawab
CSE Department, The American University in Cairo, Street, New Cairo, 11835, Cairo, Egypt.
John Elgallab
CSE Department, The American University in Cairo, Street, New Cairo, 11835, Cairo, Egypt.
Youssif Abuzied
CSE Department, The American University in Cairo, Street, New Cairo, 11835, Cairo, Egypt.
Dineo Mpanya
CSE Department, University of the Witwatersrand, Street, Johannesburg, 2000, Gauteng, South Africa.
Turgay Celik
Unknown affiliation
artificial intelligence, computer vision, cybersecurity, (health) data science, remote sensing
Swarun Kumar
Sathaye Family Foundation Professor, CMU
networks, wireless, systems, security, communication systems
Tamer ElBatt
Professor, Wireless Networks and Mobile Computing, The American University in Cairo
Wireless and Mobile Networks, Modeling, Performance Analysis, Optimization, IoT