🤖 AI Summary
Grading prostate cancer from whole-slide images (WSIs) is difficult because the images are extremely large, the tissue is highly heterogeneous, and diagnostically critical regions are hard to localize. Existing methods rely on random or static patch sampling, which admits redundant or non-informative regions and limits performance. The authors propose a Graph Laplacian Attention-Based Transformer (GLAT) combined with an Iterative Refinement Module (IRM), presented as the first approach to jointly integrate learnable graph filtering, graph Laplacian spatial regularization, and importance scoring from a frozen foundation model. The IRM progressively refines which patches are kept, using a pretrained ResNet50 for local feature extraction; the GLAT enforces spatial consistency across patches; and a convex aggregation step produces the WSI-level representation. Evaluated on five public datasets and one private dataset, the method outperforms state-of-the-art approaches, with higher grading performance and spatial consistency while remaining computationally efficient.
📝 Abstract
Prostate cancer grading from whole-slide images (WSIs) remains a challenging task due to the large-scale nature of WSIs, the presence of heterogeneous tissue structures, and the difficulty of selecting diagnostically relevant regions. Existing approaches often rely on random or static patch selection, leading to the inclusion of redundant or non-informative regions that degrade performance. To address this, we propose a Graph Laplacian Attention-Based Transformer (GLAT) integrated with an Iterative Refinement Module (IRM) to enhance both feature learning and spatial consistency. The IRM iteratively refines patch selection by leveraging a pretrained ResNet50 for local feature extraction and a foundation model in no-gradient mode for importance scoring, ensuring only the most relevant tissue regions are preserved. The GLAT models tissue-level connectivity by constructing a graph where patches serve as nodes, ensuring spatial consistency through graph Laplacian constraints and refining feature representations via a learnable filtering mechanism that enhances discriminative histological structures. Additionally, a convex aggregation mechanism dynamically adjusts patch importance to generate a robust WSI-level representation. Extensive experiments on five public datasets and one private dataset demonstrate that our model outperforms state-of-the-art methods, achieving higher performance and spatial consistency while maintaining computational efficiency.
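To make the three ingredients named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a k-nearest-neighbour graph over patch coordinates, the unnormalized Laplacian L = D - A with smoothness penalty tr(XᵀLX), softmax weights as one possible convex combination, and a single top-k selection pass standing in for one IRM refinement step. All function names, the value of k, and the random features and scores (stand-ins for ResNet50 features and foundation-model importance scores) are hypothetical; the learnable filtering and attention layers are omitted.

```python
import numpy as np

def knn_adjacency(coords, k=4):
    """Symmetric 0/1 adjacency linking each patch to its k nearest neighbours."""
    n = coords.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)              # exclude self-loops
    nbrs = np.argsort(dist, axis=1)[:, :k]      # indices of the k closest patches
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(adj, adj.T)               # symmetrise the graph

def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(adj.sum(axis=1)) - adj

def smoothness_penalty(feats, lap):
    """tr(X^T L X), equal to 0.5 * sum_ij A_ij * ||x_i - x_j||^2 for symmetric A."""
    return float(np.trace(feats.T @ lap @ feats))

def convex_aggregate(feats, scores):
    """Slide-level vector as a convex combination of patch features."""
    w = np.exp(scores - scores.max())           # numerically stable softmax
    w /= w.sum()                                # weights are >= 0 and sum to 1
    return w @ feats, w

def select_top_k(scores, k):
    """Indices of the k highest-scoring patches (one refinement pass)."""
    return np.argsort(scores)[::-1][:k]

# Toy data: 12 patches with 2-D positions, 8-D features, and importance scores.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(12, 2))
feats = rng.normal(size=(12, 8))                # stand-in for ResNet50 features
scores = rng.normal(size=12)                    # stand-in for frozen-model scores

keep = select_top_k(scores, 6)                  # discard low-importance patches
adj = knn_adjacency(coords[keep], k=3)
lap = laplacian(adj)
penalty = smoothness_penalty(feats[keep], lap)  # term a training loss could add
slide_vec, weights = convex_aggregate(feats[keep], scores[keep])
```

In this sketch the penalty is small when neighbouring patches have similar features, which is one common way a graph Laplacian constraint encourages spatial consistency; because the aggregation weights are non-negative and sum to one, the slide-level vector stays inside the convex hull of the retained patch features.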