Benchmarking the Reproducibility of Brain MRI Segmentation Across Scanners and Time

📅 2025-04-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses poor reproducibility of brain parcellation in multi-site, longitudinal structural MRI, particularly for small subcortical structures (e.g., amygdala, ventral diencephalon), where detecting subtle volume changes (5–10%) remains challenging. Method: Using the SIMON (17-year longitudinal) and SRPBS (9-site test-retest) datasets, the study evaluates FastSurfer and SynthSeg across scanners and timepoints, proposes a surface-based quality filtering strategy, and analyzes the impact of registration templates and interpolation modes. Contribution/Results: The study quantifies up to 7–8% volumetric variability in small structures, even under controlled test-retest conditions. Variability is assessed with Dice, Surface Dice, HD95, and MAPE. Code and figures are publicly available, providing a reproducible benchmark for structural MRI segmentation reliability.

📝 Abstract

Accurate and reproducible brain morphometry from structural MRI is critical for monitoring neuroanatomical changes across time and across imaging domains. Although deep learning has accelerated segmentation workflows, scanner-induced variability and reproducibility limitations remain, especially in longitudinal and multi-site settings. In this study, we benchmark two modern segmentation pipelines, FastSurfer and SynthSeg, both integrated into FreeSurfer, one of the most widely adopted tools in neuroimaging. Using two complementary datasets, a 17-year longitudinal cohort (SIMON) and a 9-site test-retest cohort (SRPBS), we quantify inter-scan segmentation variability using Dice coefficient, Surface Dice, 95th-percentile Hausdorff Distance (HD95), and Mean Absolute Percentage Error (MAPE). Our results reveal up to 7-8% volume variation in small subcortical structures such as the amygdala and ventral diencephalon, even under controlled test-retest conditions. This raises a key question: is it feasible to detect subtle longitudinal changes on the order of 5-10% in pea-sized brain regions, given the magnitude of domain-induced morphometric noise? We further analyze the effects of registration templates and interpolation modes, and propose surface-based quality filtering to improve segmentation reliability. This study provides a reproducible benchmark for morphometric reproducibility and emphasizes the need for harmonization strategies in real-world neuroimaging studies. Code and figures: https://github.com/kondratevakate/brain-mri-segmentation
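
To make two of the metrics named in the abstract concrete, here is a minimal Python sketch of the Dice coefficient and a volume-based MAPE for a pair of test-retest label maps. It is an illustration under stated assumptions, not the authors' implementation: the function names, the label id `18`, the default voxel volume, and the choice of the first scan as the MAPE reference are all hypothetical, and the paper's exact definitions may differ.

```python
import numpy as np


def dice(seg_a, seg_b, label):
    """Dice overlap of one label between two label maps of identical shape."""
    a = seg_a == label
    b = seg_b == label
    denom = a.sum() + b.sum()
    if denom == 0:
        return float("nan")  # label absent in both scans
    return float(2.0 * np.logical_and(a, b).sum() / denom)


def volume_mm3(seg, label, voxel_volume_mm3=1.0):
    """Volume of one label; the voxel volume is an assumed parameter."""
    return float((seg == label).sum()) * voxel_volume_mm3


def mape(reference_volumes, repeat_volumes):
    """Mean absolute percentage error, taking the first scan as reference (assumption)."""
    ref = np.asarray(reference_volumes, dtype=float)
    rep = np.asarray(repeat_volumes, dtype=float)
    return float(np.mean(np.abs(rep - ref) / ref) * 100.0)


# Toy test-retest pair: an 8-voxel structure that grows to 12 voxels on rescan.
LABEL = 18  # arbitrary label id chosen for this illustration
seg_test = np.zeros((4, 4, 4), dtype=int)
seg_retest = np.zeros((4, 4, 4), dtype=int)
seg_test[1:3, 1:3, 1:3] = LABEL
seg_retest[1:3, 1:3, 1:4] = LABEL

d = dice(seg_test, seg_retest, LABEL)
err = mape([volume_mm3(seg_test, LABEL)], [volume_mm3(seg_retest, LABEL)])
print(f"Dice = {d:.2f}, MAPE = {err:.1f}%")
```

In the toy pair the overlap is 8 voxels out of 8 + 12, so Dice is 0.8 while the volume error is 50%. A small structure can therefore show a large percentage volume change from a boundary shift of only a few voxels, which is why the abstract reports volume-based and overlap-based metrics side by side.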

Problem

Research questions and friction points this paper is trying to address.

Assessing reproducibility of brain MRI segmentation across scanners and time

Evaluating segmentation variability in longitudinal and multi-site MRI studies

Proposing methods to improve reliability of brain morphometric measurements

Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmark FastSurfer and SynthSeg pipelines

Analyze registration templates and interpolation

Propose surface-based quality filtering
