🤖 AI Summary
This study addresses the classification of gastrointestinal cancer histopathology images into microsatellite instability (MSI) and microsatellite stability (MSS) classes, balancing model accuracy against data privacy and robustness to membership inference and model extraction attacks. We propose DP-NF-Net, which combines differential privacy (DP) with the NF-Net (Normalizer-Free Net) architecture, and introduce an adaptive DP-AdamW optimizer. We also evaluate weighted random sampling (WRS) and class weighting (CW) as remedies for class imbalance. At a privacy budget of ε = 8, adaptive DP-AdamW reaches 76.48% classification accuracy, compared with 74.58% for standard DP-AdamW and 88.98% for the non-private model: a drop of 12.5 percentage points in exchange for a formal privacy guarantee. Key contributions include: (1) the DP-NF-Net framework; (2) the adaptive DP-AdamW optimizer; and (3) an empirical evaluation of the privacy–performance trade-off in digital pathology.
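The summary does not spell out how DP-AdamW or its adaptive variant privatizes gradients, so the following is only a minimal sketch of the generic clip-and-noise step that DP optimizers of this family typically apply before the parameter update. The function names, `max_norm`, and `noise_multiplier` are illustrative assumptions, not the authors' implementation, and the adaptive behavior is not modeled.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one per-sample gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def privatize_gradients(per_sample_grads, max_norm, noise_multiplier, rng=None):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average.

    Hypothetical helper: the noise standard deviation is
    noise_multiplier * max_norm, as in the usual Gaussian mechanism.
    """
    rng = rng or random.Random()
    clipped = [clip_gradient(g, max_norm) for g in per_sample_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]
```

In a DP variant of AdamW, the averaged noisy gradient returned here would take the place of the ordinary mini-batch gradient in the moment estimates and the decoupled weight-decay update. The privacy budget ε then follows from the noise multiplier, the sampling rate, and the number of steps via a privacy accountant, which this sketch omits.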
📝 Abstract
Based on global genomic status, cancer tumors are classified as microsatellite instable (MSI) or microsatellite stable (MSS). Immunotherapy is used to treat MSI tumors, whereas radiation and chemotherapy are used for MSS tumors. It is therefore important to classify a gastrointestinal (GI) tumor as MSI or MSS so that appropriate treatment can be provided. Existing literature has shown that deep learning (DL) can predict the class of GI tumors directly from histological images. However, DL models are susceptible to various threats, including membership inference and model extraction attacks, which make them impractical to deploy on sensitive data in real-world settings. To keep DL models useful while maintaining privacy, we integrate differential privacy (DP) with DL. In particular, this paper aims to predict GI cancer status while preserving the privacy of sensitive data. We fine-tuned the Normalizer-Free Net (NF-Net) model and obtained an accuracy of 88.98% without DP. When we fine-tuned NF-Net with DP-AdamW and adaptive DP-AdamW, we obtained accuracies of 74.58% and 76.48%, respectively. Moreover, we investigate the Weighted Random Sampler (WRS) and class weighting (CW) to address data imbalance. We also evaluate and analyze the DP algorithms in different settings.
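The two imbalance remedies named in the abstract can be illustrated with a small, framework-free sketch. It assumes the common inverse-frequency formulation; the abstract does not state which exact weighting the authors used, and all function names here are hypothetical.

```python
from collections import Counter
import random


def class_weights(labels):
    """Class weighting (CW): weight each class by n / (k * count).

    Rare classes get a larger weight in the loss. Assumed formulation.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def sample_weights(labels):
    """Per-sample weights for a weighted random sampler (WRS).

    Each sample is weighted by 1 / (size of its class), so every class
    carries the same total sampling mass.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_resample(labels, num_samples, rng=None):
    """Draw sample indices with replacement, proportional to sample_weights."""
    rng = rng or random.Random()
    return rng.choices(range(len(labels)), weights=sample_weights(labels), k=num_samples)
```

For a toy label list of three MSS images and one MSI image, `class_weights` gives the MSI class three times the weight of MSS, while `weighted_resample` draws the single MSI image about as often as the three MSS images combined. In PyTorch, the same ideas map onto `torch.utils.data.WeightedRandomSampler` and the `weight` argument of `torch.nn.CrossEntropyLoss`, although the abstract does not say which framework was used.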