🤖 AI Summary
This work addresses the high computational cost of conventional whole-slide image (WSI) analysis, which relies on multiple instance learning (MIL) and processes a large number of high-magnification image patches. The authors propose PathCTM, a model that formulates pathological diagnosis as a dynamic, sequential reasoning process. Starting from a low-magnification global view, PathCTM uses attention to guide region pruning, switches to higher magnification only where needed, and terminates inference early once prediction confidence is sufficient. By combining conditional computation, dynamic scale selection, and early stopping, the method aims to keep diagnostic accuracy while sharply reducing computation. The reported experiments show that, relative to standard MIL-based methods, PathCTM reduces patch usage by 95.95% and inference time by approximately 95.62%, with no degradation in AUC.
📝 Abstract
Traditional whole slide image (WSI) analysis methods typically rely on the multiple instance learning (MIL) paradigm, which extracts patch-level features at high magnification and aggregates them for slide-level prediction. However, such exhaustive patch-level processing is computationally expensive, severely limiting the efficiency and scalability of WSI analysis. To address this challenge, we propose PathCTM (a Pathology-oriented Continuous Thought Model), which enables token-efficient, scale-space continuous reasoning for gigapixel WSIs. PathCTM formulates diagnostic inference as a dynamic sequential information pursuit. It moves progressively from global inspection at low magnification to local inspection at high magnification, and adaptively terminates inference once enough evidence has been gathered to bound decision uncertainty. Specifically, it uses conditional computation for dynamic scale switching with attention-guided region pruning, coupled with confidence-aware early stopping. Extensive experiments demonstrate that, compared with standard MIL-based methods, PathCTM reduces the number of required image patches by 95.95% and shortens inference time by approximately 95.62%, with no degradation in AUC. Code is available at https://github.com/JSGe-AI/PathCTM.
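The coarse-to-fine loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (the linked repository is the authoritative source); it only shows, under simplifying assumptions, how attention-guided pruning and confidence-aware early stopping limit the number of patches read. Everything named here is hypothetical: the `(level, row, col)` region layout, the 2x-per-level pyramid in `children`, the `score` callable standing in for a patch encoder plus attention head, and the `keep_ratio` and `conf_threshold` parameters.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical region address: (pyramid level, row, col). Level 0 is the coarsest view.
Region = Tuple[int, int, int]


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def children(region: Region) -> List[Region]:
    """Four sub-regions covering `region` at the next (assumed 2x finer) level."""
    level, r, c = region
    return [(level + 1, 2 * r + dr, 2 * c + dc) for dr in (0, 1) for dc in (0, 1)]


def coarse_to_fine(
    roots: List[Region],
    score: Callable[[Region], float],  # hypothetical stand-in for encoder + attention head
    num_levels: int,
    keep_ratio: float = 0.25,          # assumed fraction of regions kept after pruning
    conf_threshold: float = 0.9,       # assumed confidence needed to stop early
) -> Tuple[float, int, int]:
    """Return (positive-class probability, levels visited, patches read)."""
    active = roots
    patches_read = 0
    prob = 0.5
    for depth in range(num_levels):
        # Read only the regions that survived pruning at the previous scale.
        logits = [score(region) for region in active]
        patches_read += len(active)

        # Attention-weighted pooling of per-region evidence, then a sigmoid.
        attn = softmax(logits)
        pooled = sum(a * z for a, z in zip(attn, logits))
        prob = 1.0 / (1.0 + math.exp(-pooled))

        # Confidence-aware early stopping (also stop at the finest level).
        confidence = max(prob, 1.0 - prob)
        if confidence >= conf_threshold or depth == num_levels - 1:
            return prob, depth + 1, patches_read

        # Attention-guided pruning: keep the top-k regions and zoom into them.
        k = max(1, math.ceil(keep_ratio * len(active)))
        top = sorted(range(len(active)), key=lambda i: attn[i], reverse=True)[:k]
        active = [child for i in top for child in children(active[i])]
    return prob, num_levels, patches_read
```

In this toy setting, a 4x4 grid of root regions with three levels and `keep_ratio=0.25` reads at most 16 patches per level (48 in total), whereas exhaustively reading the finest level would require 256. When the coarse view is already decisive, the loop returns after the first 16 patches. The numbers reported in the abstract come from the authors' experiments, not from this sketch.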