TICON: A Slide-Level Tile Contextualizer for Histopathology Representation Learning

📅 2025-12-24
🤖 AI Summary
Tile representations in whole-slide image (WSI) analysis lack global context, which makes it hard to serve both local and slide-level tasks with the same embeddings. This paper proposes TICON, a Transformer-based tile contextualizer that works with the outputs of any tile encoder. TICON is pretrained with a slide-level masked tile modeling objective, so that a single shared encoder both unifies and contextualizes embeddings from different tile-level foundation models. TICON-contextualized embeddings set new state-of-the-art (SOTA) results on the tile-level benchmarks HEST-Bench, THUNDER, and CATCH and on the slide-level benchmark Patho-Bench, with gains on downstream tasks including classification, segmentation, and survival prediction. An aggregator pretrained on top of TICON with only 11K WSIs outperforms SOTA slide-level foundation models pretrained on up to 350K WSIs.

📝 Abstract
The interpretation of small tiles in large whole slide images (WSIs) often needs a larger image context. We introduce TICON, a transformer-based tile representation contextualizer that produces rich, contextualized embeddings for "any" application in computational pathology. Standard tile encoder-based pipelines, which extract embeddings of tiles stripped from their context, fail to model the rich slide-level information essential for both local and global tasks. Furthermore, different tile encoders excel at different downstream tasks. Therefore, a unified model is needed to contextualize embeddings derived from "any" tile-level foundation model. TICON addresses this need with a single, shared encoder, pretrained using a masked modeling objective to simultaneously unify and contextualize representations from diverse tile-level pathology foundation models. Our experiments demonstrate that TICON-contextualized embeddings significantly improve performance across many different tasks, establishing new state-of-the-art results on tile-level benchmarks (i.e., HEST-Bench, THUNDER, CATCH) and slide-level benchmarks (i.e., Patho-Bench). Finally, we pretrain an aggregator on TICON to form a slide-level foundation model, using only 11K WSIs, outperforming SoTA slide-level foundation models pretrained with up to 350K WSIs.
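
The masked modeling objective mentioned in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the mask ratio, the zero mask token, and the mean-of-visible-tiles predictor (a stand-in for the transformer contextualizer) are all assumptions made for illustration. The sketch only shows the general pattern of hiding some tile embeddings of a slide and scoring the reconstruction on the hidden positions.

```python
import random

def mask_tiles(embeddings, mask_ratio, mask_token, rng):
    """Replace a random subset of tile embeddings with a mask token.

    Returns the corrupted sequence and the sorted indices that were masked.
    At least one tile is masked and at least one tile stays visible.
    """
    n = len(embeddings)
    if n < 2:
        raise ValueError("need at least two tiles to mask one and keep one")
    n_masked = min(n - 1, max(1, round(n * mask_ratio)))
    masked_idx = sorted(rng.sample(range(n), n_masked))
    masked_set = set(masked_idx)
    corrupted = [list(mask_token) if i in masked_set else list(e)
                 for i, e in enumerate(embeddings)]
    return corrupted, masked_idx

def mean_context_predictor(corrupted, masked_idx):
    """Hypothetical stand-in for the contextualizer.

    Predicts every masked tile as the mean of the visible tiles.
    """
    masked_set = set(masked_idx)
    visible = [e for i, e in enumerate(corrupted) if i not in masked_set]
    dim = len(corrupted[0])
    mean = [sum(e[d] for e in visible) / len(visible) for d in range(dim)]
    return [list(mean) if i in masked_set else list(e)
            for i, e in enumerate(corrupted)]

def masked_mse(predicted, target, masked_idx):
    """Mean squared error computed only over the masked tile positions."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(predicted[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count

# Toy slide: 8 tiles with 2-dimensional embeddings (illustrative values).
rng = random.Random(0)
tiles = [[float(i), float(2 * i)] for i in range(8)]
corrupted, idx = mask_tiles(tiles, 0.5, [0.0, 0.0], rng)
pred = mean_context_predictor(corrupted, idx)
loss = masked_mse(pred, tiles, idx)
```

In a real setup the predictor would be a trained model that attends over the visible tiles, and the loss would drive its parameter updates; here the loss is only computed to show which positions contribute to it.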
Problem

Research questions and friction points this paper is trying to address.

Contextualizes small tile embeddings in whole slide images
Unifies diverse tile-level foundation models into one encoder
Improves performance across tile-level and slide-level pathology tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based contextualizer for tile embeddings
Masked modeling pretraining unifies diverse foundation models
Slide-level foundation model outperforms SoTA with fewer WSIs
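
One way to picture how a single shared encoder could accept embeddings from different tile-level foundation models is to map each encoder's output to a common width first. The sketch below is an assumption, not a description of TICON's architecture: the encoder names, the embedding widths, the shared width, and the random (untrained) projection weights are all hypothetical.

```python
import random

def make_projection(in_dim, out_dim, rng):
    """Random (out_dim x in_dim) weight matrix; a stand-in for learned weights."""
    scale = in_dim ** -0.5
    return [[rng.uniform(-scale, scale) for _ in range(in_dim)]
            for _ in range(out_dim)]

def project(weights, embedding):
    """Apply a linear map (matrix-vector product) to one tile embedding."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]

SHARED_DIM = 4  # hypothetical width expected by the shared encoder
rng = random.Random(0)

# Hypothetical tile encoders with different output widths.
encoder_dims = {"encoder_a": 6, "encoder_b": 3}
projections = {name: make_projection(dim, SHARED_DIM, rng)
               for name, dim in encoder_dims.items()}

def to_shared_space(encoder_name, tile_embeddings):
    """Map one slide's tile embeddings from a given encoder to the shared width."""
    weights = projections[encoder_name]
    return [project(weights, e) for e in tile_embeddings]

# Two toy slides embedded by different encoders end up with the same width.
tiles_a = [[1.0] * 6 for _ in range(5)]
tiles_b = [[1.0] * 3 for _ in range(2)]
shared_a = to_shared_space("encoder_a", tiles_a)
shared_b = to_shared_space("encoder_b", tiles_b)
```

After this step, sequences from either encoder have the same per-tile width and could be passed to one downstream model.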
Authors
Varun Belagali
Stony Brook University
Computer Vision, Self-supervised Learning
Saarthak Kapse
PhD student, Stony Brook University
Medical Imaging, Deep Learning, Computer Vision
Pierre Marza
Postdoctoral Researcher, CentraleSupélec
Deep Learning, Computer Vision, Medical Imaging
Srijan Das
UNC Charlotte
Zilinghan Li
Machine Learning Engineer, Argonne National Laboratory
Federated learning, Distributed computing, High Performance Computing, Biomedical Informatics
Sofiène Boutaj
MICS, CentraleSupélec, Université Paris-Saclay
Pushpak Pati
Johnson & Johnson
Deep Learning, Computer Vision, Medical Imaging
Srikar Yellapragada
PhD student at Stony Brook University
Computer Vision, Machine learning
Tarak Nath Nandi
Assistant Computational Scientist, Argonne National Laboratory
Genomics, Cancer Biology, Artificial Intelligence, CFD/Turbulence, Materials Science
Ravi K Madduri
Argonne National Laboratory, University of Chicago
Joel Saltz
SUNY Distinguished Professor and Chair of Biomedical Informatics, Stony Brook University
High End Computing, Systems Software, Biomedical Informatics, Pathology Informatics
Prateek Prasanna
Associate Professor, Stony Brook University
Medical Vision, Biomedical image analysis, Radiogenomics, Radiomics, Computational Pathology
Stergios Christodoulidis
Assistant Professor at CentraleSupélec
Machine Learning, Medical Image Analysis
Maria Vakalopoulou
Assistant Professor at CentraleSupélec
Medical Imaging, Remote Sensing, Computer Vision, Machine Learning, Artificial Intelligence
Dimitris Samaras
Stony Brook University
Computer Vision, Machine Learning, Computer Graphics, Medical Imaging