AI Summary
This work proposes a novel, stealthy watermarking method for medical image segmentation models that simultaneously ensures imperceptibility, harmlessness, and black-box verifiability, capabilities lacking in existing ownership protection schemes. By leveraging uncertainty-guided backdoors, the approach modulates prediction uncertainty and integrates model-agnostic explanation techniques (e.g., LIME) to extract feature attributions, thereby embedding identifiable QR code watermarks without altering segmentation outputs. To the best of our knowledge, this is the first method to introduce harmless, covert watermarks into medical segmentation models, enabling efficient ownership verification in black-box settings. Extensive experiments across four medical datasets and five state-of-the-art models, including SAM, demonstrate over 95% watermark verification success rates with negligible performance degradation (<1% drop in both Dice and AUC), significantly outperforming current backdoor-based watermarking approaches.
Abstract
Annotating medical data for training AI models is often costly and limited due to the shortage of specialists with relevant clinical expertise. This challenge is further compounded by privacy and ethical concerns associated with sensitive patient information. As a result, well-trained medical segmentation models built on private datasets constitute valuable intellectual property requiring robust protection mechanisms. Existing model protection techniques primarily focus on classification and generative tasks, while segmentation models, which are crucial to medical image analysis, remain largely underexplored. In this paper, we propose a novel, stealthy, and harmless method, StealthMark, for verifying the ownership of medical segmentation models under closed-box conditions. Our approach subtly modulates model uncertainty without altering the final segmentation outputs, thereby preserving the model's performance. To enable ownership verification, we incorporate model-agnostic explanation methods, e.g., LIME, to extract feature attributions from the model outputs. Under specific triggering conditions, these explanations reveal a distinct and verifiable watermark. We further design the watermark as a QR code to facilitate robust and recognizable ownership claims. We conducted extensive experiments across four medical imaging datasets (the CMR dataset from UK Biobank, the SEG fundus dataset, the EchoNet echocardiography dataset, and the PraNet colonoscopy dataset) and five mainstream segmentation models. The results demonstrate the effectiveness and stealthiness of our method, as well as its harmlessness to the original model's segmentation performance. For example, when applied to the SAM model, StealthMark consistently achieved attack success rates (ASR) above 95% across various datasets while maintaining less than a 1% drop in Dice and AUC scores, significantly outperforming backdoor-based watermarking methods and highlighting its strong potential for practical deployment.
Our implementation code is made available at https://github.com/Qinkaiyu/StealthMark
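To make the verification idea concrete, here is a minimal, illustrative sketch of the general principle: an explanation method attributes a model's uncertainty score to image patches, and on trigger inputs the attribution map decodes to an embedded bit pattern (a QR-code-like watermark). This is not the authors' implementation; the occlusion-based attribution (a simplified stand-in for LIME), the toy `uncertainty_score` function, the 4x4 `WATERMARK` grid, and all names are hypothetical assumptions for illustration only.

```python
import numpy as np

def patch_attributions(score_fn, image, grid=4):
    """LIME-style attribution via occlusion: for each patch, measure how much
    the model's scalar score drops when that patch is masked out."""
    h, w = image.shape
    ph, pw = h // grid, w // grid
    base = score_fn(image)
    attr = np.zeros((grid, grid))
    for i in range(grid):
        for j in range(grid):
            masked = image.copy()
            masked[i*ph:(i+1)*ph, j*pw:(j+1)*pw] = 0.0  # occlude one patch
            attr[i, j] = base - score_fn(masked)        # score drop = attribution
    return attr

# Hypothetical watermark bits the owner embedded (stand-in for a QR code).
WATERMARK = np.array([[1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [1, 1, 0, 0],
                      [0, 0, 1, 1]])

def uncertainty_score(image, grid=4):
    """Toy watermarked 'uncertainty head': on trigger inputs, uncertainty is
    dominated by exactly the patches that carry watermark bits."""
    h, w = image.shape
    ph, pw = h // grid, w // grid
    s = 0.0
    for i in range(grid):
        for j in range(grid):
            patch_mean = image[i*ph:(i+1)*ph, j*pw:(j+1)*pw].mean()
            s += WATERMARK[i, j] * patch_mean
    return s

# Black-box verification: query with a trigger input, explain the uncertainty,
# then threshold the attribution map to recover the watermark bits.
trigger = np.ones((32, 32))
attr = patch_attributions(uncertainty_score, trigger)
decoded = (attr > attr.mean()).astype(int)
print(decoded)  # recovers the embedded bit pattern
```

The key property mirrored here is that verification only needs query access to the model's outputs (black-box), and the watermark lives in the uncertainty/explanation channel rather than in the segmentation masks themselves.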