🤖 AI Summary
This work addresses the susceptibility of deep learning models for whole-slide image (WSI) analysis to non-semantic shortcuts, such as background color and brightness biases, which lead to overfitting and poor generalization. We propose a model-agnostic, lightweight, and plug-and-play framework for shortcut detection and diagnosis. Our method integrates gradient masking, perturbation sensitivity analysis, and self-supervised contrastive learning to enable interpretable, architecture- and task-agnostic bias identification. It operates efficiently on a single consumer-grade GPU. We identify possible shortcuts in a pathology foundation model and reproduce known shortcuts previously observed in self-supervised models. The toolkit is released as open source on GitHub and is intended to support more robust, generalizable models for WSI analysis.
📝 Abstract
Even foundation models trained on datasets with billions of samples may develop shortcuts that lead to overfitting and bias. Shortcuts are patterns in the data that are irrelevant to the task, such as background color or color intensity. To ensure the robustness of deep learning applications, methods are needed to detect and remove such shortcuts. Today's model debugging methods are time-consuming because they often require customization to fit a given model architecture in a specific domain. We propose a generalized, model-agnostic framework to debug deep learning models. We focus on the domain of histopathology, which has very large images that require large models and therefore large computational resources. Nevertheless, our framework can be run on a workstation with a commodity GPU. We demonstrate that our framework can replicate non-image shortcuts that have been found in previous work for self-supervised learning models, and we also identify possible shortcuts in a foundation model. Our easy-to-use tests contribute to the development of more reliable, accurate, and generalizable models for whole-slide image (WSI) analysis. Our framework is available as an open-source tool on GitHub.