🤖 AI Summary
To address inter-stain tissue misalignment between hematoxylin and eosin (H&E) and multi-marker immunohistochemistry (IHC) whole-slide images (WSIs) caused by imperfect registration, this paper proposes a two-stage Cross-Stain Contrastive Learning (CSCL) framework. The authors construct the first five-stain paired WSI dataset, train a lightweight adapter for patch-level inter-stain feature alignment, and then pretrain a multiple instance learning (MIL) model that combines a cross-stain attention fusion module with a global contrastive alignment loss on slide-level embeddings. The work is presented as the first framework enabling collaborative representation learning across H&E and multiple IHC stains. Experiments show that the learned representations consistently improve performance on cancer subtype classification, IHC biomarker prediction, and survival analysis, indicating good generalizability and cross-stain transferability.
📝 Abstract
Universal, transferable whole-slide image (WSI) representations are central to computational pathology. Incorporating multiple markers (e.g., immunohistochemistry, IHC) alongside H&E enriches H&E-based features with diverse, biologically meaningful information. However, progress is limited by the scarcity of well-aligned multi-stain datasets. Inter-stain misalignment shifts corresponding tissue across slides, which prevents consistent patch-level features and degrades slide-level embeddings. To address this, we curate a slide-level aligned, five-stain dataset (H&E, HER2, KI67, ER, PGR) that enables paired H&E-IHC learning and robust cross-stain representation. Leveraging this dataset, we propose Cross-Stain Contrastive Learning (CSCL), a two-stage pretraining framework. In the first stage, a lightweight adapter is trained with patch-wise contrastive alignment to improve the compatibility of H&E features with the corresponding IHC-derived contextual cues. In the second stage, slide-level representations are learned with Multiple Instance Learning (MIL), using a cross-stain attention fusion module to integrate stain-specific patch features and a cross-stain global alignment module to enforce consistency among slide-level embeddings across stains. Experiments on cancer subtype classification, IHC biomarker status classification, and survival prediction show consistent gains, yielding high-quality, transferable H&E slide-level representations. The code and data are available at https://github.com/lily-zyz/CSCL.
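The abstract does not specify the exact form of the patch-wise contrastive objective. As an illustration only, the following minimal sketch assumes a symmetric InfoNCE loss over paired H&E and IHC patch embeddings, a common choice for this kind of alignment. The function name `symmetric_info_nce`, the temperature value of 0.07, and the toy embeddings are all assumptions made for this sketch and are not taken from the CSCL code release.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def log_softmax_at(logits, idx):
    """Numerically stable log-softmax of `logits`, evaluated at position `idx`."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return logits[idx] - log_sum


def symmetric_info_nce(he_feats, ihc_feats, temperature=0.07):
    """Symmetric InfoNCE loss over paired patch embeddings.

    Row i of `he_feats` and row i of `ihc_feats` are assumed to come from
    the same tissue location (a positive pair); every other row in the
    batch serves as a negative. The temperature is an assumed value.
    """
    if not he_feats or len(he_feats) != len(ihc_feats):
        raise ValueError("expected two non-empty batches of equal size")

    he = [l2_normalize(v) for v in he_feats]
    ihc = [l2_normalize(v) for v in ihc_feats]
    n = len(he)

    # Temperature-scaled cosine similarity between every H&E/IHC pair.
    sim = [
        [sum(a * b for a, b in zip(he[i], ihc[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    # H&E -> IHC direction: each row should peak on its diagonal entry.
    loss_he = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # IHC -> H&E direction: each column should peak on its diagonal entry.
    loss_ihc = -sum(
        log_softmax_at([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (loss_he + loss_ihc)


# Toy example: two 2-D patch embeddings per stain.
he_patches = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = symmetric_info_nce(he_patches, [[1.0, 0.0], [0.0, 1.0]])
misaligned_loss = symmetric_info_nce(he_patches, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy example the misaligned pairing produces a much larger loss than the aligned one, which is the signal an adapter would be trained to reduce. In practice such a loss would typically be written in a tensor framework and applied to adapter outputs on top of a frozen patch encoder; that setup is likewise an assumption here, not a detail stated in the abstract.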