🤖 AI Summary
In digital pathology, cancer subtype classification on gigapixel whole-slide images (WSIs) suffers from under-exploited multi-scale information, poor interpretability, and model redundancy. To address these challenges, we propose a lightweight graph-structured multi-scale WSI representation framework that dynamically models hierarchical inter-scale relationships across magnifications, emulating pathologists’ diagnostic reasoning. We replace conventional pooling with convergence-based graph node aggregation, which enables interpretable, adaptive selection of discriminative magnifications and regions for the first time. Our method integrates a multi-scale feature pyramid, cross-magnification attention, and multiple-instance learning (MIL) enhancement. It achieves significant accuracy improvements over state-of-the-art methods on multiple cancer subtype benchmarks while reducing parameter count by over 60%. Clinical interpretability is validated by two board-certified pathologists. The framework is backbone-agnostic and robust to hyperparameter choices.
📝 Abstract
Cancer subtyping is one of the most challenging tasks in digital pathology, where Multiple Instance Learning (MIL) over gigapixel whole slide images (WSIs) has been in the spotlight of recent research. However, MIL approaches do not take advantage of the inter- and intra-magnification information contained in WSIs. In this work, we present GRASP, a novel lightweight graph-structured multi-magnification framework for processing WSIs in digital pathology. Our approach is designed to dynamically emulate the pathologist's behavior in handling WSIs and benefits from the hierarchical structure of WSIs. GRASP introduces a convergence-based node aggregation mechanism that replaces traditional pooling mechanisms. It outperforms state-of-the-art methods by a wide margin in terms of balanced accuracy, while having significantly fewer parameters than the closest-performing state-of-the-art models. Our results show that GRASP dynamically finds and consults different magnifications when subtyping cancers, is reliable and stable across different hyperparameters, and generalizes when using features from different backbones. The model's behavior has been evaluated by two expert pathologists, who confirmed the interpretability of the model's dynamics. We also provide a theoretical foundation, along with empirical evidence, explaining how GRASP interacts with different magnifications and nodes in the graph to make predictions. We believe that the strong characteristics yet simple structure of GRASP will encourage the development of interpretable, structure-based designs for WSI representation in digital pathology. Data and code are available at https://github.com/AIMLab-UBC/GRASP
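To make the idea of a multi-magnification graph with convergence-based aggregation more concrete, the toy sketch below may help. It is not the authors' implementation: the three magnification levels, the pyramid connectivity, the random features, and the helper names `build_pyramid_adjacency` and `propagate` are all assumptions made for illustration, and plain mean-neighbor averaging stands in for whatever learned graph layers GRASP actually uses. The sketch only shows why repeated neighborhood aggregation on a connected graph pulls node features toward agreement, so that a simple mean over nodes could serve as a slide-level readout without a dedicated pooling layer.

```python
import numpy as np


def build_pyramid_adjacency(num_regions: int, num_mags: int) -> np.ndarray:
    """Adjacency matrix for a toy multi-magnification graph (hypothetical layout).

    Node index = mag * num_regions + region. Nodes at the same magnification
    are fully connected (intra-magnification edges, self-loops included), and
    each region is linked to itself at the adjacent magnification
    (inter-magnification edges).
    """
    n = num_regions * num_mags
    adj = np.zeros((n, n), dtype=float)
    for m in range(num_mags):
        lo = m * num_regions
        hi = lo + num_regions
        adj[lo:hi, lo:hi] = 1.0  # intra-magnification block
        if m + 1 < num_mags:
            for r in range(num_regions):
                a, b = lo + r, hi + r
                adj[a, b] = adj[b, a] = 1.0  # inter-magnification edge
    return adj


def propagate(features: np.ndarray, adj: np.ndarray, num_layers: int) -> np.ndarray:
    """Apply mean-neighbor aggregation num_layers times (no learned weights)."""
    norm_adj = adj / adj.sum(axis=1, keepdims=True)  # row-normalize
    h = features
    for _ in range(num_layers):
        h = norm_adj @ h
    return h


# Assumed toy setup: 4 regions, 3 magnifications, 8-dim patch features.
rng = np.random.default_rng(0)
num_regions, num_mags, dim = 4, 3, 8
x = rng.normal(size=(num_regions * num_mags, dim))
adj = build_pyramid_adjacency(num_regions, num_mags)

spread_before = x.std(axis=0).mean()  # disagreement between nodes before
h = propagate(x, adj, num_layers=10)
spread_after = h.std(axis=0).mean()   # disagreement after aggregation

# Once nodes largely agree, a plain mean is a reasonable slide-level readout.
slide_embedding = h.mean(axis=0)
print(f"node spread before: {spread_before:.3f}, after: {spread_after:.3f}")
```

In a trained model the aggregation would involve learned weights and nonlinearities, so how strongly each magnification influences the final representation would depend on the data; the averaging here only illustrates the convergence behavior.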