🤖 AI Summary
To address the extreme scarcity of labeled data in whole-slide image (WSI) classification, this paper proposes the Meta-Optimized Classifier (MOC), which introduces meta-learning into few-shot WSI diagnosis. MOC dynamically ensembles diverse candidate classifiers drawn from a classifier bank to adaptively model multi-scale pathological features, unifying vision-language foundation models (VLFMs), multiple-instance learning (MIL), and meta-learning for end-to-end adaptive selection and fusion of classification strategies. Evaluated on few-shot benchmarks including TCGA-NSCLC, MOC significantly outperforms state-of-the-art methods: it improves AUC by an absolute 10.4% over the best few-shot VLFM baseline, with gains of up to 26.25% in the 1-shot setting. These results demonstrate MOC's strong generalization under ultra-low annotation budgets and its potential for clinical deployment.
📝 Abstract
Recent advances in histopathology vision-language foundation models (VLFMs) have shown promise in addressing data scarcity for whole slide image (WSI) classification via zero-shot adaptation. However, these methods remain outperformed by conventional multiple instance learning (MIL) approaches trained on large datasets, motivating recent efforts to enhance VLFM-based WSI classification through few-shot learning paradigms. While existing few-shot methods improve diagnostic accuracy with limited annotations, their reliance on conventional classifier designs introduces critical vulnerabilities to data scarcity. To address this problem, we propose a Meta-Optimized Classifier (MOC) comprising two core components: (1) a meta-learner that automatically optimizes a classifier configuration from a mixture of candidate classifiers and (2) a classifier bank housing diverse candidate classifiers to enable a holistic pathological interpretation. Extensive experiments demonstrate that MOC outperforms prior art on multiple few-shot benchmarks. Notably, on the TCGA-NSCLC benchmark, MOC improves AUC by 10.4% over state-of-the-art few-shot VLFM-based methods, with gains of up to 26.25% under 1-shot conditions, offering a critical advancement for clinical deployments where diagnostic training data is severely limited. Code is available at https://github.com/xmed-lab/MOC.
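The core idea described in the abstract, a classifier bank of diverse candidate heads fused by meta-learned mixture weights, can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the candidate heads (linear probe, cosine similarity to prototypes, distance to prototypes), the fixed mixture parameters `alpha`, and the random features are all hypothetical stand-ins; the actual MOC operates on VLFM features of WSI patch bags via MIL and optimizes the mixture on the few-shot support set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy slide-level features and class prototypes (hypothetical stand-ins for
# VLFM embeddings aggregated over a WSI's patches).
n_slides, n_classes, dim = 4, 2, 16
features = rng.normal(size=(n_slides, dim))
prototypes = rng.normal(size=(n_classes, dim))
W = rng.normal(size=(dim, n_classes))  # weights for a linear-probe head

def linear_probe(x):
    # Candidate 1: a plain linear classifier head.
    return x @ W

def cosine_head(x):
    # Candidate 2: cosine similarity to class prototypes (CLIP-style).
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    pn = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return xn @ pn.T

def distance_head(x):
    # Candidate 3: negative Euclidean distance to prototypes.
    d = np.linalg.norm(x[:, None, :] - prototypes[None, :, :], axis=-1)
    return -d

# "Classifier bank": diverse candidate classifiers.
classifier_bank = [linear_probe, cosine_head, distance_head]

# Meta-learned mixture parameters (fixed here for illustration; in MOC the
# meta-learner would optimize these on the few-shot support set).
alpha = np.array([0.2, 1.0, 0.5])
weights = np.exp(alpha) / np.exp(alpha).sum()  # softmax over candidates

# Fused prediction: weighted sum of each candidate head's logits.
logits = sum(w * head(features) for w, head in zip(weights, classifier_bank))
preds = logits.argmax(axis=1)  # one class prediction per slide
```

The design choice the abstract motivates is that no single classifier head is robust under extreme data scarcity, so the mixture lets the meta-learner up-weight whichever heads generalize best for a given few-shot task.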