🤖 AI Summary
Existing medical image de-identification methods struggle to simultaneously ensure strong privacy protection and preserve diagnostic semantic fidelity, while lacking adjustable privacy granularity. To address this, we propose a “divide-and-conquer” anonymization framework: first, a tunable-ratio identity-region masking mechanism enables fine-grained, controllable privacy enforcement; second, semantic features are extracted using a pre-trained medical foundation model, and a Minimum Description Length (MDL)-guided feature disentanglement strategy explicitly separates identity-related information from diagnostic semantics. This achieves effective decoupling of privacy-critical and clinically relevant attributes. Our method supports continuous, fine-grained privacy-level adjustment. Extensive evaluation across seven diverse medical imaging datasets and three downstream diagnostic tasks demonstrates consistent and significant improvements over state-of-the-art approaches, achieving the optimal trade-off between de-identification strength and diagnostic utility.
📝 Abstract
Medical imaging has significantly advanced computer-aided diagnosis, yet its re-identification (ReID) risks raise critical privacy concerns, calling for de-identification (DeID) techniques. Unfortunately, existing DeID methods neither adequately preserve medical semantics nor flexibly adjust to different privacy levels. To address these issues, we propose a divide-and-conquer framework comprising two steps: (1) Identity-Blocking, which blocks varying proportions of identity-related regions to achieve different privacy levels; and (2) Medical-Semantics-Compensation, which leverages pre-trained Medical Foundation Models (MFMs) to extract medical semantic features that compensate for the blocked regions. Moreover, recognizing that features from MFMs may still contain residual identity information, we introduce a Minimum Description Length principle-based feature decoupling strategy to effectively decouple and discard such identity components. Extensive evaluations against existing approaches across seven datasets and three downstream tasks demonstrate our state-of-the-art performance.
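To make the first step concrete, here is a minimal sketch of tunable-ratio identity-region blocking. It is an illustrative assumption, not the paper's implementation: `identity_mask` stands in for a hypothetical detector's map of identity-related pixels, and `ratio` is the continuous privacy level; the blocked regions would later be filled in by the MFM-based semantic compensation.

```python
import numpy as np

def block_identity_regions(image, identity_mask, ratio, rng=None):
    """Zero out a tunable fraction of identity-related pixels.

    identity_mask : boolean map of identity-related regions (hypothetical,
                    e.g. produced by a face/landmark detector).
    ratio         : privacy level in [0, 1]; higher blocks more regions.
    """
    rng = np.random.default_rng(rng)
    out = image.copy()
    ys, xs = np.nonzero(identity_mask)          # candidate identity pixels
    n_block = int(round(ratio * len(ys)))       # how many to block at this level
    idx = rng.choice(len(ys), size=n_block, replace=False)
    out[ys[idx], xs[idx]] = 0                   # blocked; to be semantically compensated
    return out
```

Sweeping `ratio` from 0 to 1 yields the continuous privacy-utility adjustment described above: at 0 the image is untouched, at 1 every identity-related pixel is removed.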