A Comprehensive Benchmark of Histopathology Foundation Models for Kidney Histopathology

📅 2026-03-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the inadequate performance of existing general-purpose histopathology foundation models on non-neoplastic chronic kidney disease tasks. We present the first comprehensive evaluation framework tailored to renal pathology, integrating multiple stains, scales, and tasks, and systematically assess 11 publicly available foundation models across 11 kidney-specific downstream tasks. To ensure rigor, we employ repeated stratified group cross-validation and nested cross-validation, complemented by Friedman tests, Wilcoxon signed-rank tests, and Holm–Bonferroni correction. Our findings reveal that while models perform adequately on mesoscopic structural diagnosis, they exhibit significant limitations in microscopic structure recognition, complex phenotype interpretation, and prognostic inference, highlighting their constrained ability to capture subtle renal pathological signals. The reproducible evaluation toolkit, kidney-hfm-eval, is openly released.
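The repeated stratified group cross-validation mentioned above can be illustrated with a small, dependency-free sketch. It is a simplified greedy stand-in, not the authors' implementation and not the kidney-hfm-eval API; the function name, its parameters, and the toy patient IDs are assumptions made for illustration. In practice one would more likely reach for scikit-learn's `StratifiedGroupKFold` and vary its `random_state` across repeats.

```python
import random
from collections import Counter, defaultdict


def repeated_stratified_group_folds(labels, groups, n_splits=5, n_repeats=3, seed=0):
    """Yield (repeat, fold, test_indices); each group lands in exactly one fold.

    Hypothetical helper: a greedy approximation of stratified group k-fold.
    """
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)

    for repeat in range(n_repeats):
        rng = random.Random(seed + repeat)  # a different shuffle per repeat
        group_ids = sorted(by_group)
        rng.shuffle(group_ids)
        # Stable sort: larger groups first, shuffled order kept among ties.
        group_ids.sort(key=lambda g: -len(by_group[g]))

        folds = [[] for _ in range(n_splits)]
        fold_counts = [Counter() for _ in range(n_splits)]
        for g in group_ids:
            g_counts = Counter(labels[i] for i in by_group[g])
            # Choose the fold holding the fewest samples of this group's
            # classes; break ties by the smaller fold.
            best = min(
                range(n_splits),
                key=lambda f: (
                    sum(fold_counts[f][c] * n for c, n in g_counts.items()),
                    len(folds[f]),
                ),
            )
            folds[best].extend(by_group[g])
            fold_counts[best].update(g_counts)

        for fold, test_idx in enumerate(folds):
            yield repeat, fold, sorted(test_idx)


# Toy data: 12 patients (groups), 2 tiles each, label alternates by patient.
labels = [g % 2 for g in range(12) for _ in range(2)]
groups = [f"p{g}" for g in range(12) for _ in range(2)]
splits = list(repeated_stratified_group_folds(labels, groups, n_splits=3, n_repeats=2))
```

Keeping all tiles from one patient in the same fold is what prevents leakage between training and test sets; the stratification step only tries to keep class proportions similar across folds.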

📝 Abstract
Histopathology foundation models (HFMs), pretrained on large-scale cancer datasets, have advanced computational pathology. However, their applicability to non-cancerous chronic kidney disease remains underexplored, despite the coexistence of renal pathology with malignancies such as renal cell and urothelial carcinoma. We systematically evaluate 11 publicly available HFMs across 11 kidney-specific downstream tasks spanning multiple stains (PAS, H&E, PASM, and IHC), spatial scales (tile- and slide-level), task types (classification, regression, and copy detection), and clinical objectives (detection, diagnosis, and prognosis). Tile-level performance is assessed using repeated stratified group cross-validation, while slide-level tasks are evaluated using repeated nested stratified cross-validation. Statistical significance is examined using the Friedman test followed by pairwise Wilcoxon signed-rank tests with Holm–Bonferroni correction and compact letter display visualization. To promote reproducibility, we release an open-source Python package, kidney-hfm-eval, available at https://pypi.org/project/kidney-hfm-eval/, that reproduces the evaluation pipelines. Results show moderate to strong performance on tasks driven by coarse meso-scale renal morphology, including diagnostic classification and detection of prominent structural alterations. In contrast, performance consistently declines for tasks requiring fine-grained microstructural discrimination, complex biological phenotypes, or slide-level prognostic inference, largely independent of stain type. Overall, current HFMs appear to encode predominantly static meso-scale representations and may have limited capacity to capture subtle renal pathology or prognosis-related signals. Our results highlight the need for kidney-specific, multi-stain, and multimodal foundation models to support clinically reliable decision-making in nephrology.
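As a rough illustration of the multiple-comparison step named in the abstract, the sketch below applies a Holm–Bonferroni adjustment to a list of pairwise p-values. It is a minimal sketch under stated assumptions: the function name `holm_bonferroni` and the example p-values are hypothetical, and the code is not taken from kidney-hfm-eval. In a full pipeline the inputs would come from pairwise Wilcoxon signed-rank tests (for example `scipy.stats.wilcoxon`), run only after the Friedman test indicates an overall difference.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p_values, reject_flags) in the original input order.

    Hypothetical helper implementing the Holm step-down adjustment.
    """
    m = len(p_values)
    # Visit the hypotheses from the smallest raw p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    reject = [p <= alpha for p in adjusted]
    return adjusted, reject


# Illustrative p-values for three pairwise model comparisons (made up).
raw_p = [0.01, 0.04, 0.03]
adjusted_p, significant = holm_bonferroni(raw_p)
```

With these made-up inputs, only the first comparison remains significant at the 0.05 level after adjustment, which shows how the correction can overturn a raw p-value that looked significant in isolation.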
Problem

Research questions and friction points this paper is trying to address.

histopathology foundation models
chronic kidney disease
computational pathology
renal pathology
prognostic inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Histopathology foundation models
Chronic kidney disease
Multi-stain benchmarking
Computational pathology
Reproducible evaluation
Harishwar Reddy Kasireddy
Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
Patricio S. La Rosa
Seed Production Innovation, Crop Science Division, Bayer Company, St. Louis, MO, USA
Akshita Gupta
TU Darmstadt
Deep Learning, Speech & Audio Processing, Computer Vision
Anindya S. Paul
Assistant Scientist, University of Florida
AI Scientist, Medical Imaging, AI privacy, Ex-Intel Labs
Jamie L. Fermin
Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
William L. Clapp
Department of Pathology, Immunology and Laboratory Medicine, University of Florida College of Medicine, Gainesville, FL, USA
Meryl A. Waldman
Kidney Disease Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
Tarek M. El-Ashkar
Indiana University School of Medicine, Indianapolis, IN, USA
Sanjay Jain
Departments of Medicine, Washington University School of Medicine, St. Louis, MO, USA
Luis Rodrigues
INESC-ID, Instituto Superior Técnico, Universidade de Lisboa
Distributed Systems
Kuang Yu Jen
Department of Pathology and Laboratory Medicine, University of California at Davis School of Medicine, Sacramento, CA, USA
Avi Z. Rosenberg
Johns Hopkins University-School of Medicine
Renal Pathology, Placental Pathology, Postmortem Pathology, Proteomics, Microdissection
Michael T. Eadon
Indiana University School of Medicine, Indianapolis, IN, USA
Jeffrey B. Hodgin
Department of Pathology, University of Michigan, Ann Arbor, MI, USA
Pinaki Sarder
Associate Professor, Medicine: Quantitative Health, University of Florida at Gainesville
Digital Pathology, Image Analysis, Image Processing