🤖 AI Summary
This study addresses the challenge of brain tumor localization when data silos and privacy regulations restrict access to large-scale, multi-institutional medical datasets. To overcome this, the authors propose the first hybrid architecture integrating federated learning with a Transformer–graph neural network (GNN), enabling collaborative model training across institutions without sharing raw patient data. The framework incorporates an attention mechanism that provides modality-level interpretability. Evaluated on the BraTS dataset, the model matches the performance of centralized training. Statistical analysis using Bonferroni-corrected paired t-tests (p < 0.001, Cohen's d = 1.50) shows that deeper network layers attend significantly more to T2/FLAIR, the clinically critical imaging modalities, validating both the method's efficacy and its clinical relevance.
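The core idea of training collaboratively without moving raw patient data can be sketched as a FedAvg-style aggregation step: each institution trains locally and only model parameters are averaged, weighted by local dataset size. This is a minimal illustrative sketch, not the paper's CAFEIN implementation — the function name, layer layout, and toy numbers below are assumptions for demonstration.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg-style aggregation: weighted average of client parameters.

    client_weights: one list of np.ndarray layers per client (only the
                    parameters leave the institution, never raw data)
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    averaged = []
    for layer in range(n_layers):
        # Weight each client's contribution by its share of the data.
        acc = sum(w[layer] * (n / total)
                  for w, n in zip(client_weights, client_sizes))
        averaged.append(acc)
    return averaged

# Toy example: two clients, a single one-layer "model".
w_a = [np.array([1.0, 3.0])]   # hypothetical client A parameters
w_b = [np.array([3.0, 5.0])]   # hypothetical client B parameters
avg = fedavg([w_a, w_b], [100, 300])
print(avg[0])  # client B (3x more data) dominates: [2.5 4.5]
```

The weighting by sample count is what lets clients with more data pull the global model further, which is one plausible mechanism behind the abstract's observation that federation outpaces isolated single-client training.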
📝 Abstract
Deep learning models for brain tumor analysis require large, diverse datasets that are often siloed across healthcare institutions due to privacy regulations. We present a federated learning framework for brain tumor localization that enables multi-institutional collaboration without sharing sensitive patient data. Our method extends a hybrid Transformer–graph neural network (GNN) architecture derived from prior decoder-free supervoxel GNNs and is deployed within CAFEIN®, CERN's federated learning platform designed for healthcare environments. We provide an explainability analysis through Transformer attention mechanisms that reveals which MRI modalities drive the model's predictions. Experiments on the BraTS dataset demonstrate a key finding: while isolated training on an individual client's data triggers early stopping well before the model reaches its full capacity, federated learning sustains improvement by leveraging distributed data, ultimately matching centralized performance. This result strongly justifies federated learning for complex tasks with high-dimensional inputs, where aggregating knowledge from multiple institutions substantially benefits training. Our explainability analysis, validated through rigorous statistical testing on the full test set (paired t-tests with Bonferroni correction), reveals that deeper network layers attend significantly more to the T2 and FLAIR modalities (p < 0.001, Cohen's d = 1.50), aligning with clinical practice.
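The statistical validation reported above can be sketched as follows: a paired t-test on per-case attention scores at a shallow versus a deep layer, a Bonferroni correction across multiple comparisons, and a paired-samples Cohen's d. The synthetic data, sample size, and correction factor `m` below are placeholders for illustration, not the paper's actual BraTS measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-case fraction of attention on T2/FLAIR at a shallow
# vs. a deep Transformer layer (synthetic, illustrative only).
shallow = rng.normal(0.40, 0.05, size=100)
deep = shallow + rng.normal(0.08, 0.05, size=100)

# Paired t-test: same cases measured at two layers.
t, p = stats.ttest_rel(deep, shallow)

# Bonferroni correction for m comparisons (e.g. one per layer pair).
m = 6  # placeholder comparison count
p_corrected = min(p * m, 1.0)

# Cohen's d for paired samples: mean difference / SD of differences.
diff = deep - shallow
d = diff.mean() / diff.std(ddof=1)

print(f"t = {t:.2f}, Bonferroni-corrected p = {p_corrected:.2e}, d = {d:.2f}")
```

With an effect of this shape, d lands well above the conventional "large effect" threshold of 0.8, which is the same regime as the d = 1.50 the abstract reports for the T2/FLAIR attention shift.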