🤖 AI Summary
Adapting pathology foundation models (PFMs) to clinical tasks on gigapixel whole-slide images (WSIs) is difficult under weak supervision (i.e., only WSI-level labels), particularly on single-GPU hardware. Method: We propose TAPFM, an end-to-end task-adaptation framework that jointly optimizes patch-level feature representations and slide-level prediction via a learnable multiple-instance learning (MIL) aggregator built upon ViT’s self-attention mechanism. Crucially, TAPFM decouples the PFM backbone from the aggregator during training, enabling efficient fine-tuning on a single GPU. Results: On bladder cancer and lung adenocarcinoma mutation prediction tasks, which include multi-label classification of clinically actionable mutations, TAPFM substantially outperforms state-of-the-art weakly supervised methods. Both training and inference run entirely on a single consumer-grade GPU (e.g., RTX 4090), achieving strong performance while remaining practical to deploy in resource-constrained clinical settings.
📝 Abstract
Pathology foundation models (PFMs) have emerged as powerful tools for analyzing whole slide images (WSIs). However, adapting these pretrained PFMs to specific clinical tasks presents considerable challenges, primarily because only weak (WSI-level) labels are available for gigapixel images, which necessitates a multiple instance learning (MIL) paradigm for effective WSI analysis. This paper proposes a novel approach for single-GPU Task Adaptation of PFMs (TAPFM) that uses vision transformer (ViT) attention for MIL aggregation while optimizing both feature representations and attention weights. The proposed approach maintains separate computational graphs for the MIL aggregator and the PFM to create stable training dynamics that align with downstream task objectives during end-to-end adaptation. Evaluated on mutation prediction tasks for bladder cancer and lung adenocarcinoma across institutional and TCGA cohorts, TAPFM consistently outperforms conventional approaches, with H-Optimus-0 (TAPFM) outperforming the benchmarks. TAPFM also effectively handles multi-label classification of actionable mutations. Thus, TAPFM makes adaptation of powerful pretrained PFMs practical on standard hardware for various clinical applications.
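To make the idea of attention-based MIL aggregation concrete, here is a minimal pure-Python sketch, not the TAPFM implementation. It assumes each patch already has a feature vector and a scalar attention score. In TAPFM the scores come from the ViT's self-attention, whereas here they are plain inputs. The function names, the toy numbers, and the linear multi-label head are hypothetical, and the sketch does not show the separate computational graphs or any gradient updates.

```python
import math


def softmax(scores):
    """Normalise raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_features, attention_scores):
    """Aggregate patch features into one slide-level vector.

    patch_features: list of equal-length feature vectors, one per patch.
    attention_scores: one raw score per patch (assumed given here).
    Returns the attention-weighted sum and the normalised weights.
    """
    weights = softmax(attention_scores)
    dim = len(patch_features[0])
    slide_feature = [
        sum(w * feat[d] for w, feat in zip(weights, patch_features))
        for d in range(dim)
    ]
    return slide_feature, weights


def multilabel_probs(slide_feature, label_weights, label_biases):
    """Hypothetical linear head: one independent sigmoid per label."""
    probs = []
    for row, bias in zip(label_weights, label_biases):
        logit = sum(w * x for w, x in zip(row, slide_feature)) + bias
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


# Toy slide: three patches with 2-dimensional features (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.2, 0.1, 2.0]  # the third patch receives most of the attention
slide_vec, attn = attention_pool(features, scores)

# Two hypothetical mutation labels, each scored independently.
probs = multilabel_probs(slide_vec, [[1.0, -1.0], [0.5, 0.5]], [0.0, -0.5])
print(attn, slide_vec, probs)
```

Because each label gets its own sigmoid rather than a shared softmax, a slide can be positive for several mutations at once, which is what the multi-label setting in the abstract requires.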