AI Summary
This work proposes Robust-ComBat, an extension of the widely used ComBat harmonization framework for diffusion MRI (dMRI) data. It targets a limitation of conventional ComBat, which assumes Gaussian subject distributions and is therefore biased by pathological outliers in cohorts with a high proportion of neurological patients, leading to inaccurate site-effect estimates. Robust-ComBat uses a simple multilayer perceptron (MLP) to compensate for these outliers, so that potentially informative cases need not be discarded. Evaluated on multi-site datasets in which up to 80% of subjects have neurological disorders, the method consistently achieves lower harmonization error than conventional statistical baselines across ComBat variants while preserving disease-related signal.
Abstract
Harmonization methods such as ComBat and its variants are widely used to mitigate site-specific biases in diffusion MRI (dMRI). However, ComBat assumes that subject distributions exhibit a Gaussian profile. In practice, patients with neurological disorders often present diffusion metrics that deviate markedly from those of healthy controls, introducing pathological outliers that distort site-effect estimation. This problem is particularly challenging in clinical practice: most patients undergoing brain imaging have an underlying, as yet undiagnosed condition, and their scans were prescribed precisely to establish a diagnosis, which makes it difficult to exclude them from harmonization cohorts. In this paper, we show that harmonizing data to a normative reference population with ComBat while including pathological cases induces significant distortions. Across 7 neurological conditions, we evaluated 10 outlier rejection methods with 4 ComBat variants over a wide range of scenarios, revealing that many filtering strategies fail in the presence of pathology. In contrast, a simple MLP provides robust outlier compensation, enabling reliable harmonization while preserving disease-related signal. Experiments on both control and real multi-site cohorts, comprising up to 80% of subjects with neurological disorders, demonstrate that Robust-ComBat consistently outperforms conventional statistical baselines, with lower harmonization error across all ComBat variants.
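To make the distortion described above concrete, here is a minimal Python sketch of how pathological cases can bias a moment-based (mean and standard deviation) estimate of a site effect relative to a normative reference. It is an illustrative toy simulation only: it is neither Robust-ComBat nor any ComBat implementation. The function names, the additive/multiplicative site model, and all numeric values (reference mean, site shift, disease offset) are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical normative reference for one diffusion metric (assumed values).
REF_MEAN, REF_STD = 0.45, 0.02


def simulate_site(n, patient_fraction, rng, site_shift=0.03, site_scale=1.2,
                  disease_offset=-0.06):
    """Simulate one site's measurements under an assumed location/scale site effect.

    Patients are modelled as controls whose metric is shifted by `disease_offset`.
    Returns the observed values and a boolean mask marking the patients.
    """
    n_patients = int(round(n * patient_fraction))
    is_patient = np.arange(n) < n_patients
    biological = rng.normal(REF_MEAN, REF_STD, size=n)
    biological[is_patient] += disease_offset
    # Assumed site effect: rescale around the reference mean, then shift.
    observed = REF_MEAN + site_scale * (biological - REF_MEAN) + site_shift
    return observed, is_patient


def estimate_site_effect(values, ref_mean=REF_MEAN, ref_std=REF_STD):
    """Moment-based estimate of the additive shift and multiplicative scale."""
    shift = values.mean() - ref_mean
    scale = values.std(ddof=1) / ref_std
    return shift, scale


rng = np.random.default_rng(0)
observed, is_patient = simulate_site(n=2000, patient_fraction=0.8, rng=rng)

# Estimate using every subject versus using controls only.
shift_all, scale_all = estimate_site_effect(observed)
shift_hc, scale_hc = estimate_site_effect(observed[~is_patient])

print(f"true shift 0.030 | all subjects {shift_all:.3f} | controls only {shift_hc:.3f}")
print(f"true scale 1.200 | all subjects {scale_all:.3f} | controls only {scale_hc:.3f}")
```

In this toy setting, the estimate computed from controls alone recovers the simulated site shift, whereas the estimate computed from the mixed cohort absorbs part of the disease offset into the site effect. Correcting the data with that biased estimate would remove disease-related signal along with the site bias, which is the failure mode the abstract attributes to including pathological cases.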