🤖 AI Summary
Computational cytology faces two key challenges: unreliable instance-level labels and extremely low witness rates (the fraction of abnormal cells in a slide, e.g., only 0.5%). To address these, we propose a slide-label-aware multi-task pretraining framework. First, we jointly optimize weakly supervised similarity learning and self-supervised contrastive learning to extract fine-grained discriminative features from sparse annotations. Second, we introduce an adaptive gradient surgery mechanism to mitigate inter-task gradient conflicts and ensure training stability. Third, we design an attention-based multiple-instance aggregator that simultaneously optimizes bag-level classification and abnormal-cell retrieval. Evaluated on a bone marrow cytology dataset, our method achieves significant improvements in bag-level F1 score and Top-400 positive-cell retrieval accuracy, particularly under ultra-low witness rates, outperforming existing approaches. This work establishes a scalable, weakly supervised paradigm for computational cytology.
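The attention-based multiple-instance aggregator mentioned above can be sketched as follows. This is a minimal illustration in the style of standard attention MIL pooling (Ilse et al.); the paper's actual aggregator architecture and parameter names are not given here, so all names and shapes below are assumptions:

```python
import numpy as np

def attention_mil_pool(H, V, w):
    """Sketch of attention-based MIL pooling (illustrative, not the paper's API).

    H: (n_instances, d) instance embeddings from the pretrained encoder.
    V: (d, k) and w: (k,) are learnable attention parameters.
    Returns the bag embedding and per-instance attention weights; the
    weights can be ranked to retrieve the most abnormal instances.
    """
    scores = np.tanh(H @ V) @ w          # (n,) unnormalized attention scores
    a = np.exp(scores - scores.max())    # numerically stable softmax
    a = a / a.sum()                      # attention weights, sum to 1
    z = a @ H                            # (d,) attention-weighted bag embedding
    return z, a
```

The bag embedding `z` would feed a bag-level classifier, while sorting instances by `a` gives the attention-guided Top-k retrieval of positive cells.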
📝 Abstract
Computational cytology faces two major challenges: (i) instance-level labels are unreliable and prohibitively costly to obtain, and (ii) witness rates are extremely low. We propose SLAM-AGS, a Slide-Label-Aware Multitask pretraining framework that jointly optimizes (i) a weakly supervised similarity objective on slide-negative patches and (ii) a self-supervised contrastive objective on slide-positive patches, yielding stronger performance on downstream tasks. To stabilize learning, we apply Adaptive Gradient Surgery to tackle conflicting task gradients and prevent model collapse. We integrate the pretrained encoder into an attention-based Multiple Instance Learning aggregator for bag-level prediction and attention-guided retrieval of the most abnormal instances in a bag. On a publicly available bone-marrow cytology dataset, with simulated witness rates from 10% down to 0.5%, SLAM-AGS improves bag-level F1 score and Top-400 positive-cell retrieval over other pretraining methods, with the largest gains at low witness rates, showing that resolving gradient interference enables stable pretraining and better performance on downstream tasks. To facilitate reproducibility, we share our complete implementation and evaluation framework as open source: https://github.com/Ace95/SLAM-AGS.
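The abstract does not spell out the Adaptive Gradient Surgery procedure; the sketch below illustrates the underlying idea of resolving conflicting task gradients by projection (in the style of PCGrad). The function name and the two-task setup are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def gradient_surgery(g_sim, g_con):
    """Sketch of PCGrad-style gradient surgery for two tasks (illustrative).

    g_sim: gradient of the weakly supervised similarity objective.
    g_con: gradient of the self-supervised contrastive objective.
    If the gradients conflict (negative inner product), each is projected
    onto the normal plane of the other before they are summed, so neither
    task's update directly undoes the other's progress.
    """
    g1, g2 = g_sim.copy(), g_con.copy()
    if np.dot(g1, g_con) < 0:
        # remove the component of g1 that conflicts with g_con
        g1 = g1 - (np.dot(g1, g_con) / np.dot(g_con, g_con)) * g_con
    if np.dot(g2, g_sim) < 0:
        # remove the component of g2 that conflicts with g_sim
        g2 = g2 - (np.dot(g2, g_sim) / np.dot(g_sim, g_sim)) * g_sim
    return g1 + g2  # combined update direction
```

After projection, the combined update has a non-negative inner product with both original task gradients, which is the stability property the abstract attributes to gradient surgery.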