OpenPath: Open-Set Active Learning for Pathology Image Classification via Pre-trained Vision-Language Models

📅 2025-06-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Active learning (AL) for pathology image classification becomes inefficient in open-set scenarios, where out-of-distribution (OOD) samples in the unlabeled pool consume annotation budget without improving the model. This paper proposes OpenPath, an open-set AL framework for pathology images built on a pre-trained vision-language model (VLM). In the first query round, task-specific prompts that combine target classes with relevant non-target classes let the VLM separate in-distribution (ID) samples from OOD samples, replacing the usual random initial selection. In later rounds, Diverse Informative ID Sampling (DIS) combines Prototype-based ID candidate Selection (PIS) with Entropy-Guided Stochastic Sampling (EGSS) so that each query is both pure (few OOD samples) and informative. On two public pathology image datasets, OpenPath selects samples with high ID purity, improves classification performance, and outperforms several state-of-the-art open-set AL methods.

📝 Abstract
Pathology image classification plays a crucial role in accurate medical diagnosis and treatment planning. Training high-performance models for this task typically requires large-scale annotated datasets, which are both expensive and time-consuming to acquire. Active Learning (AL) offers a solution by iteratively selecting the most informative samples for annotation, thereby reducing the labeling effort. However, most AL methods are designed under the assumption of a closed-set scenario, where all the unannotated images belong to target classes. In real-world clinical environments, the unlabeled pool often contains a substantial amount of Out-Of-Distribution (OOD) data, leading to low efficiency of annotation in traditional AL methods. Furthermore, most existing AL methods start with random selection in the first query round, leading to a significant waste of labeling costs in open-set scenarios. To address these challenges, we propose OpenPath, a novel open-set active learning approach for pathological image classification leveraging a pre-trained Vision-Language Model (VLM). In the first query, we propose task-specific prompts that combine target and relevant non-target class prompts to effectively select In-Distribution (ID) and informative samples from the unlabeled pool. In subsequent queries, Diverse Informative ID Sampling (DIS) that includes Prototype-based ID candidate Selection (PIS) and Entropy-Guided Stochastic Sampling (EGSS) is proposed to ensure both purity and informativeness in a query, avoiding the selection of OOD samples. Experiments on two public pathology image datasets show that OpenPath significantly enhances the model's performance due to its high purity of selected samples, and outperforms several state-of-the-art open-set AL methods. The code is available at https://github.com/HiLab-git/OpenPath.
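The first-query idea in the abstract (scoring unlabeled images against both target and non-target class prompts) can be illustrated with a small sketch. This is not the OpenPath implementation: it assumes image and prompt embeddings have already been produced by some VLM, and the function names, the softmax temperature, and the "sum of target-class probabilities" ID score are all illustrative assumptions. It also ranks by ID score only and omits the informativeness criterion the abstract mentions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def id_scores(image_embs, target_text_embs, nontarget_text_embs, temperature=0.01):
    """Hypothetical ID score: probability mass assigned to target-class prompts.

    The embeddings are assumed to come from a pre-trained VLM; the
    temperature value is an assumption, not a value taken from the paper.
    """
    text_embs = list(target_text_embs) + list(nontarget_text_embs)
    n_target = len(target_text_embs)
    scores = []
    for img in image_embs:
        logits = [cosine(img, t) / temperature for t in text_embs]
        probs = softmax(logits)
        scores.append(sum(probs[:n_target]))
    return scores


def first_query(image_embs, target_text_embs, nontarget_text_embs, budget):
    """Return indices of the `budget` images with the highest ID score."""
    scores = id_scores(image_embs, target_text_embs, nontarget_text_embs)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]
```

Including non-target prompts gives OOD images somewhere to place their probability mass, so an image that matches no target class well receives a low ID score instead of being forced onto the nearest target class.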
Problem

Research questions and friction points this paper is trying to address.

Reduces labeling effort in pathology image classification via active learning
Addresses inefficiency of traditional AL methods with OOD data
Improves sample selection purity and informativeness using VLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses pre-trained Vision-Language Model (VLM)
Task-specific prompts for initial query
Diverse Informative ID Sampling (DIS)
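As a rough illustration of how a two-stage query such as DIS might be organized, the sketch below first keeps the unlabeled samples closest to class prototypes (a stand-in for PIS) and then samples from those candidates with probability proportional to predictive entropy (a stand-in for EGSS). The summary above does not specify either stage, so the prototype definition (per-class feature mean), the cosine ranking, and the entropy-weighted sampling rule are assumptions chosen for illustration, not the authors' algorithm.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def prototype_candidates(unlabeled_feats, labeled_feats, labels, n_candidates):
    """Keep the unlabeled samples most similar to any class prototype.

    Prototypes are assumed here to be per-class means of labeled ID features.
    """
    by_class = {}
    for feat, label in zip(labeled_feats, labels):
        by_class.setdefault(label, []).append(feat)
    prototypes = [mean_vector(feats) for feats in by_class.values()]
    sims = [max(cosine(f, p) for p in prototypes) for f in unlabeled_feats]
    ranked = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    return ranked[:n_candidates]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weighted_sample(candidates, probs, budget, rng):
    """Sample without replacement, weighting each candidate by its entropy.

    `probs[i]` is the classifier's predicted distribution for sample i.
    The small constant keeps every weight strictly positive.
    """
    pool = list(candidates)
    weights = [entropy(probs[i]) + 1e-12 for i in pool]
    chosen = []
    while pool and len(chosen) < budget:
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
        weights.pop(pick)
    return chosen
```

Sampling stochastically rather than taking the top-entropy samples outright is one common way to keep a query diverse, since the highest-entropy samples often cluster in the same region of feature space.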
Lanfeng Zhong
University of Electronic Science and Technology of China & Shanghai AI Lab
Deep learning, Medical Image Analysis, Computer Vision
Xin Liao
Department of Pathology, West China Second University Hospital, Sichuan University, Chengdu, China
Shichuan Zhang
Department of Pathology, West China Second University Hospital, Sichuan University, Chengdu, China
Shao-Ming Zhang
University of Electronic Science and Technology of China, Chengdu, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
Guotai Wang
Professor, University of Electronic Science and Technology of China
medical image analysis, computer vision, deep learning