🤖 AI Summary
Rare cancer subtyping faces critical challenges, including a scarcity of expert pathologists, limited labeled data, and poor model interpretability. These challenges are most acute in pediatric oncology, where rare cancers account for over 70% of cases. While existing vision-language foundation models exhibit strong zero-shot performance on common cancers, their clinical utility for rare subtypes remains limited. Mainstream multiple-instance learning approaches rely solely on visual features and lack both cross-modal semantic alignment and fine-grained tumor localization. To address these gaps, the authors propose PathPT, a framework that uses the zero-shot inference of vision-language models to generate tile-level weak supervision signals. PathPT combines spatially aware feature aggregation with task-adaptive few-shot prompt tuning to localize tumor regions and align visual features with pathological semantics across modalities. Evaluated on eight rare cancer datasets, PathPT achieves significant improvements in average subtyping accuracy while also improving localization precision and interpretability, and it generalizes across both adult and pediatric rare tumors.
📝 Abstract
Rare cancers comprise 20–25% of all malignancies but face major diagnostic challenges due to limited expert availability, especially in pediatric oncology, where they represent over 70% of cases. While pathology vision-language (VL) foundation models show promising zero-shot capabilities for common cancer subtyping, their clinical performance for rare cancers remains limited. Existing multi-instance learning (MIL) methods rely only on visual features, overlooking cross-modal knowledge and compromising the interpretability that is critical for rare cancer diagnosis. To address this limitation, we propose PathPT, a novel framework that fully exploits the potential of vision-language pathology foundation models through spatially aware visual aggregation and task-specific prompt tuning. Unlike conventional MIL, PathPT converts WSI-level supervision into fine-grained tile-level guidance by leveraging the zero-shot capabilities of VL models, thereby preserving localization of cancerous regions and enabling cross-modal reasoning through prompts aligned with histopathological semantics. We benchmark PathPT on eight rare cancer datasets (four adult and four pediatric) spanning 56 subtypes and 2,910 WSIs, as well as three common cancer datasets, evaluating four state-of-the-art VL models and four MIL frameworks under three few-shot settings. Results show that PathPT consistently delivers superior performance, achieving substantial gains in subtyping accuracy and cancerous region grounding ability. This work advances AI-assisted diagnosis for rare cancers, offering a scalable solution for improving subtyping accuracy in settings with limited access to specialized expertise.
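To make the idea of turning WSI-level supervision into tile-level guidance more concrete, the following is a minimal sketch of one plausible scheme, not the authors' implementation. It assumes that tile and prompt embeddings come from some VL encoder (replaced here by toy vectors), that one prompt describes normal tissue, and that any tile the zero-shot step does not call normal inherits the slide label. The function names `l2_normalize` and `tile_pseudo_labels` and the parameter `normal_idx` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norms, eps)


def tile_pseudo_labels(tile_emb, prompt_emb, slide_label, normal_idx=0):
    """Derive per-tile pseudo-labels from a single slide-level label.

    tile_emb:    (n_tiles, d) tile embeddings from a VL image encoder (assumed).
    prompt_emb:  (n_classes, d) text embeddings, one prompt per class (assumed).
    slide_label: class index assigned to the whole slide.
    normal_idx:  index of the "normal tissue" prompt (an assumption of this sketch).
    """
    tiles = l2_normalize(np.asarray(tile_emb, dtype=float))
    prompts = l2_normalize(np.asarray(prompt_emb, dtype=float))

    # Zero-shot step: cosine similarity of every tile to every class prompt.
    sims = tiles @ prompts.T              # shape (n_tiles, n_classes)
    zero_shot = sims.argmax(axis=1)       # most similar class per tile

    # Tiles that look normal stay normal; all other tiles inherit the slide label.
    return np.where(zero_shot == normal_idx, normal_idx, slide_label)


# Toy example: 3 classes (0 = normal), 3 tiles, slide labeled as class 2.
prompts = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
tiles = np.array([[0.9, 0.1, 0.0],
                  [0.1, 0.8, 0.2],
                  [0.0, 0.3, 0.9]])
labels = tile_pseudo_labels(tiles, prompts, slide_label=2)
print(labels.tolist())
```

In this toy setup the first tile is closest to the normal prompt and keeps the normal label, while the other two tiles receive the slide label. Pseudo-labels of this kind could then serve as weak targets during few-shot prompt tuning; how PathPT actually builds and uses its tile-level signals is described in the paper itself.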