Simple Image Processing and Similarity Measures Can Link Data Samples across Databases through Brain MRI

📅 2026-02-10

🤖 AI Summary

This study addresses the privacy risk that de-identified brain MRI scans can still be linked across databases, and potentially re-identified, because each individual's brain anatomy is unique. The authors propose a lightweight, training-free method that relies solely on standard preprocessing of skull-stripped T1-weighted images, followed by image similarity computation, to match scans from the same participant across different acquisition times, scanners, spatial resolutions, and protocols. Experiments show near-perfect linkage accuracy under these varied conditions. Prior studies had already suggested that brain MRI could enable participant linkage, but they relied on training-based or computationally intensive methods; this work shows that straightforward image processing alone is sufficient. The findings point to a privacy vulnerability that current neuroimaging data-sharing policies may underestimate.

📝 Abstract

Head Magnetic Resonance Imaging (MRI) is routinely collected and shared for research under strict regulatory frameworks. These frameworks require removing potential identifiers before sharing. However, even after skull stripping, the brain parenchyma contains unique signatures that can match other MRIs from the same participants across databases, posing a privacy risk if additional data features are available. Current regulatory frameworks often require that such risks be assessed against a standard of reasonableness. Prior studies have already suggested that a brain MRI could enable participant linkage, but they have relied on training-based or computationally intensive methods. Here, we demonstrate that linking an individual's skull-stripped T1-weighted MRI, which may lead to re-identification if other identifiers are available, is possible using standard preprocessing followed by image similarity computation. In experiments simulating MRI matching across databases, nearly perfect linkage accuracy was achieved across various time intervals, scanner types, spatial resolutions, and acquisition protocols, despite potential cognitive decline. We hope these results will inform the development of thoughtful, forward-looking policies for medical data sharing.
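
The linkage idea described in the abstract (preprocess, then compare images with a similarity measure) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not name the similarity metric, so normalized cross-correlation is swapped in as one common choice; the volumes are synthetic random arrays standing in for skull-stripped, already-aligned T1-weighted scans; and the function names (`normalized_cross_correlation`, `link_scans`, `make_synthetic_databases`) are hypothetical, not taken from the paper.

```python
import numpy as np


def normalized_cross_correlation(a, b):
    """Pearson-style similarity between two volumes of identical shape.

    Assumption: this metric is a stand-in; the source does not specify
    which similarity measure the authors used.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # A constant volume carries no signal to correlate against.
        return 0.0
    return float(np.dot(a, b) / denom)


def link_scans(queries, gallery):
    """For each query volume, return the index of the most similar gallery volume.

    Also returns the full (n_queries, n_gallery) similarity matrix.
    """
    scores = np.array(
        [[normalized_cross_correlation(q, g) for g in gallery] for q in queries]
    )
    return scores.argmax(axis=1), scores


def make_synthetic_databases(n_subjects=5, shape=(8, 8, 8), noise=0.3, seed=0):
    """Build two toy 'databases' that share the same hypothetical subjects.

    Each subject has a fixed random 'anatomy'. Database B applies a different
    intensity scale and offset plus independent noise, loosely mimicking a
    different scanner or protocol. This is not real MRI data.
    """
    rng = np.random.default_rng(seed)
    anatomy = rng.normal(size=(n_subjects, *shape))
    db_a = anatomy + noise * rng.normal(size=anatomy.shape)
    db_b = 1.5 * anatomy + 10.0 + noise * rng.normal(size=anatomy.shape)
    return db_a, db_b


if __name__ == "__main__":
    db_a, db_b = make_synthetic_databases()
    matches, scores = link_scans(db_a, db_b)
    for query_index, gallery_index in enumerate(matches):
        print(f"scan {query_index} in A best matches scan {gallery_index} in B")
```

Normalized cross-correlation is used in this sketch because it is unaffected by a global intensity scale or offset, which is one plausible source of variation between scanners. In real use, the volumes would first need the standard preprocessing the abstract mentions, so that the voxels being compared correspond to the same anatomical locations.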

Problem

Research questions and friction points this paper is trying to address.

privacy risk

brain MRI

de-identification

data linkage

medical data sharing

Innovation

Methods, ideas, or system contributions that make the work stand out.

image similarity

brain MRI re-identification

skull-stripped MRI