🤖 AI Summary
Existing foundation models in computational pathology often rely on backbone networks pretrained on natural images, which struggle to capture the heterogeneity and non-uniformity of tissue morphology, limiting clinical interpretability and utility. To address this, the work proposes CARE, a foundation model that integrates molecular information, specifically RNA and protein data, into the regional modeling of histopathology images. CARE employs a two-stage self-supervised pretraining strategy: it first learns morphological representations from unlabeled whole-slide images, then leverages molecular data to guide the generation of biologically meaningful and morphologically coherent adaptive tissue regions. Notably, CARE requires no manual segmentation and uses only one-tenth of the pretraining data typical of mainstream models, yet achieves superior average performance across 33 diverse downstream tasks, including morphological classification, molecular prediction, and survival analysis, demonstrating enhanced generalization and clinical relevance.
📝 Abstract
Foundation models have recently achieved impressive success in computational pathology, demonstrating strong generalization across diverse histopathology tasks. However, existing models overlook the heterogeneous and non-uniform organization of pathological regions of interest (ROIs) because they rely on natural image backbones not tailored for tissue morphology. Consequently, they often fail to capture coherent tissue architecture beyond isolated patches, limiting interpretability and clinical relevance. To address these challenges, we present Cross-modal Adaptive Region Encoder (CARE), a foundation model for pathology that automatically partitions whole-slide images (WSIs) into morphologically relevant regions. Specifically, CARE employs a two-stage pretraining strategy: (1) a self-supervised unimodal pretraining stage that learns morphological representations from 34,277 WSIs without segmentation annotations, and (2) a cross-modal alignment stage that leverages RNA and protein profiles to refine the construction and representation of adaptive regions. This molecular guidance enables CARE to identify biologically relevant patterns and generate irregular yet coherent tissue regions, from which the most representative area is selected as the ROI. CARE supports a broad range of pathology-related tasks, using either the ROI feature or the slide-level feature obtained by aggregating adaptive regions. Pretrained on only one-tenth of the data typically used by mainstream foundation models, CARE achieves superior average performance across 33 downstream benchmarks, including morphological classification, molecular prediction, and survival analysis, and outperforms other foundation model baselines overall.
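The abstract's second stage pairs image-derived region embeddings with matched molecular (RNA/protein) profiles. The paper's exact objective is not given here; cross-modal alignment of this kind is commonly implemented as a symmetric InfoNCE contrastive loss, as in CLIP-style training. Below is a minimal NumPy sketch of that generic loss, with all function and variable names illustrative rather than taken from CARE:

```python
# Hedged sketch: a generic symmetric InfoNCE loss aligning adaptive-region
# embeddings (image side) with molecular-profile embeddings. This is NOT
# CARE's published objective, only a common choice for such alignment.
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Project embeddings onto the unit sphere so dot products are cosines.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def info_nce(region_emb, mol_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings.

    region_emb: (B, D) embeddings of adaptive tissue regions
    mol_emb:    (B, D) embeddings of the matched RNA/protein profiles
    Row i of each matrix is assumed to come from the same sample.
    """
    z_img = l2_normalize(region_emb)
    z_mol = l2_normalize(mol_emb)
    logits = z_img @ z_mol.T / temperature  # (B, B); diagonal = positive pairs

    def xent(lg):
        # Cross-entropy with the diagonal as the target class (log-softmax).
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # Average the image->molecule and molecule->image directions.
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
B, D = 8, 32
regions = rng.normal(size=(B, D))
loss_mismatched = info_nce(regions, rng.normal(size=(B, D)))       # random pairing
loss_matched = info_nce(regions, regions + 0.01 * rng.normal(size=(B, D)))
# Well-aligned pairs should yield a much lower loss than random pairings.
assert loss_matched < loss_mismatched
```

Minimizing such a loss pulls each region's embedding toward its own molecular profile and away from the other samples in the batch, which is one plausible mechanism for the "molecular guidance" the abstract describes.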