Enhanced Diagnostic Performance via Large-Resolution Inference Optimization for Pathology Foundation Models

📅 2026-01-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limitations of existing foundation models in computational pathology, which are constrained by fixed input resolutions and often suffer from GPU memory overflow or loss of morphological detail when processing high-resolution whole-slide images (WSIs). To overcome these challenges, the authors propose an efficient inference strategy that employs a spatially aware sparse attention mechanism over neighboring patches to reduce computational overhead. Additionally, a global attention scoring scheme is introduced to identify and prune non-informative tokens, preserving diagnostically relevant details without increasing GPU memory usage. The method enables higher-resolution WSI analysis, achieving up to a 7.67% performance gain in region-of-interest classification tasks while maintaining competitive results in segmentation, thereby significantly enhancing downstream task performance.
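The two ideas in the summary, attention restricted to spatial neighbors and score-based token pruning, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the square neighborhood defined by `radius`, and the `keep_ratio` parameter are all hypothetical, and the per-token scores are taken as given because the summary does not say how the global attention scores are computed.

```python
def neighbor_mask(grid_h, grid_w, radius=1):
    """Build a boolean attention mask over a grid of patch tokens.

    mask[i][j] is True when token j lies within `radius` grid steps
    (in both row and column) of token i. The square neighborhood is an
    assumption made for illustration only.
    """
    n = grid_h * grid_w
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        r_i, c_i = divmod(i, grid_w)
        for j in range(n):
            r_j, c_j = divmod(j, grid_w)
            mask[i][j] = abs(r_i - r_j) <= radius and abs(c_i - c_j) <= radius
    return mask


def prune_tokens(scores, keep_ratio=0.5):
    """Return the indices of the highest-scoring tokens, in original order.

    `scores` stands in for per-token global attention scores; how they are
    obtained is not specified here, so they are simply passed in.
    """
    k = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    mask = neighbor_mask(4, 4, radius=1)
    allowed = sum(sum(row) for row in mask)
    print(f"allowed pairs: {allowed} of {len(mask) ** 2}")
    print("kept tokens:", prune_tokens([0.1, 0.9, 0.3, 0.7], keep_ratio=0.5))
```

On a 4 × 4 grid with `radius=1`, a corner token attends to 4 tokens and an interior token to 9, so far fewer token pairs are evaluated than under dense attention. A practical version would apply such a mask inside a block-sparse attention kernel rather than materializing the full matrix.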

📝 Abstract

Despite their strong performance on tasks such as ROI classification and segmentation, many pathology foundation models remain constrained to a fixed input size (e.g., 224 × 224), creating substantial inefficiencies when applied to whole-slide images (WSIs), which span thousands of pixels in each dimension. A naive strategy is to either enlarge the inputs or downsample the WSIs. However, enlarging inputs results in prohibitive GPU memory consumption, while downsampling alters the microns-per-pixel resolution and obscures critical morphological details. To overcome these limitations, we propose a space- and time-efficient inference strategy that sparsifies attention using spatially aware neighboring blocks and filters out non-informative tokens through global attention scores. This design substantially reduces GPU memory and runtime during high-resolution WSI inference while preserving, and in some cases improving, downstream performance, enabling inference at higher resolutions under the same GPU budget. The experimental results show that our method achieves up to a 7.67% improvement in ROI classification and comparable results in segmentation.
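To see why simply enlarging the input is costly, a rough back-of-the-envelope calculation helps. The sketch below assumes a ViT-style model with 16-pixel patches and dense self-attention; neither the patch size nor the 1792-pixel example resolution comes from the abstract, and the helper name `attention_cost` is hypothetical.

```python
def attention_cost(side_px, patch_px=16):
    """Return (token count, dense attention pair count) for a square input.

    Assumes non-overlapping square patches and ignores any class or
    register tokens; both are simplifications for illustration.
    """
    tokens = (side_px // patch_px) ** 2
    return tokens, tokens * tokens


if __name__ == "__main__":
    for side in (224, 1792):
        tokens, pairs = attention_cost(side)
        print(f"{side} px: {tokens} tokens, {pairs} attention pairs")
```

Under these assumptions, an 8× larger side length yields 64× more tokens and 4096× more attention pairs per head and layer, which is the growth that sparsifying attention and pruning tokens aim to contain.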

Problem

Research questions and friction points this paper is trying to address.

whole-slide images

input resolution constraint

GPU memory consumption

morphological detail loss

pathology foundation models

Innovation

Methods, ideas, or system contributions that make the work stand out.

large-resolution inference

pathology foundation models

attention sparsification