🤖 AI Summary
Digital pathology whole-slide image (WSI) classification under the multiple-instance learning (MIL) paradigm faces key bottlenecks: high computational overhead, reliance on large-scale self-supervised learning (SSL) pretraining, and degraded cross-domain transfer performance. This paper proposes a lightweight MIL framework that eliminates redundant SSL pretraining and enables continual few-shot adaptation. Its core innovation is a pathology-specific sparse Transformer pooling mechanism, theoretically proven to be a universal approximator with the tightest known probabilistic upper bound on the number of layers required. This mechanism drastically reduces computational complexity while achieving state-of-the-art accuracy at both the WSI and patch level on the CAMELYON16 and TCGA lung cancer datasets, at a substantially lower training cost than prevailing methods.
📝 Abstract
Whole Slide Image (WSI) classification with multiple instance learning (MIL) in digital pathology faces significant computational challenges. Current methods mostly rely on extensive self-supervised learning (SSL) pretraining for satisfactory performance, which requires long training periods and considerable computational resources. Forgoing pretraining, however, also degrades performance because of the domain shift from natural images to WSIs. We introduce Snuffy, a novel MIL-pooling architecture based on sparse transformers that mitigates the performance loss of limited pretraining and makes continual few-shot pretraining a competitive option. Our sparsity pattern is tailored to pathology and is theoretically proven to be a universal approximator with the tightest probabilistic sharp bound on the number of layers for sparse transformers to date. We demonstrate Snuffy's effectiveness on the CAMELYON16 and TCGA lung cancer datasets, achieving superior WSI- and patch-level accuracies. The code is available at https://github.com/jafarinia/snuffy.
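To make the core idea concrete, here is a minimal, hypothetical NumPy sketch of sparse-attention MIL pooling: patch embeddings pass through one self-attention layer whose keys are restricted to a few global tokens plus a handful of random patches (a stand-in for the paper's pathology-specific sparsity pattern, whose exact structure is not reproduced here), and the result is pooled into a slide-level representation. The function name and parameters are illustrative, not the authors' API.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_attention_pool(patches, n_global=2, n_random=4, seed=0):
    """One sparse self-attention layer over patch embeddings, then mean-pool.

    Each patch attends only to a small key set: the first `n_global` patches
    (acting as global tokens), `n_random` randomly chosen patches, and itself.
    This is an illustrative sparsity pattern, not Snuffy's actual one.
    """
    rng = np.random.default_rng(seed)
    n, d = patches.shape
    scores = patches @ patches.T / np.sqrt(d)   # (n, n) attention logits
    mask = np.full((n, n), -np.inf)             # -inf blocks an attention edge
    mask[:, :n_global] = 0.0                    # every patch sees global tokens
    for i in range(n):
        idx = rng.choice(n, size=min(n_random, n), replace=False)
        mask[i, idx] = 0.0                      # plus a few random keys
        mask[i, i] = 0.0                        # and itself
    attn = softmax(scores + mask, axis=-1)      # sparse attention weights
    attended = attn @ patches                   # (n, d) updated embeddings
    return attended.mean(axis=0)                # slide-level representation

# Usage: pool 16 eight-dimensional patch embeddings into one slide vector.
emb = np.random.default_rng(1).normal(size=(16, 8))
slide_repr = sparse_attention_pool(emb)
```

Because each query attends to O(n_global + n_random) keys rather than all n patches, a masked implementation like this scales linearly in the number of patches, which is the source of the computational savings the abstract describes.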