PhenoBench: A Comprehensive Benchmark for Cell Phenotyping

📅 2025-07-04

🤖 AI Summary
In digital pathology, foundation models have so far lacked a unified benchmark for evaluating cell phenotyping. PhenoBench addresses this gap for H&E-stained histopathology images. It comprises PhenoCell, a new dataset with 14 granular cell types identified using multiplexed imaging, and open-source, ready-to-use code for fine-tuning and evaluating pathology foundation models on dense cell phenotype prediction. Systematic evaluation shows that foundation models reach macro F1 scores above 0.70 on established benchmarks such as Lizard and PanNuke, but score as low as 0.20 on PhenoCell. The benchmark also examines how models generalize under technical versus medical domain shifts. Together, these results indicate that PhenoCell poses a considerably harder task than previous benchmarks capture.

📝 Abstract
Digital pathology has seen the advent of a wealth of foundation models (FMs), yet to date their performance on cell phenotyping has not been benchmarked in a unified manner. We therefore propose PhenoBench: a comprehensive benchmark for cell phenotyping on Hematoxylin and Eosin (H&E) stained histopathology images. We provide both PhenoCell, a new H&E dataset featuring 14 granular cell types identified using multiplexed imaging, and ready-to-use fine-tuning and benchmarking code that allows the systematic evaluation of multiple prominent pathology FMs in terms of dense cell phenotype predictions in different generalization scenarios. We perform extensive benchmarking of existing FMs, providing insights into their generalization behavior under technical vs. medical domain shifts. Furthermore, while FMs achieve macro F1 scores > 0.70 on previously established benchmarks such as Lizard and PanNuke, we observe scores as low as 0.20 on PhenoCell. This indicates a much more challenging task not captured by previous benchmarks, establishing PhenoCell as a prime asset for future benchmarking of FMs and supervised models alike. Code and data are available on GitHub.
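The abstract reports macro F1 scores, which average the per-class F1 without weighting by class frequency, so rare cell types count as much as common ones. The following is a minimal sketch of that metric and not the benchmark's own evaluation code. It assumes each cell already has one ground-truth label and one predicted label (how predicted cells are matched to annotated cells is not described here), and the class names are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels=None):
    """Unweighted mean of per-class F1 scores over `labels`."""
    if labels is None:
        labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the cell is not p
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: the model predicts the majority class only.
y_true = ["tumor"] * 8 + ["lymphocyte"] * 2
y_pred = ["tumor"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}")                    # 0.800
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")    # 0.444
```

In this toy case accuracy stays high while macro F1 falls, because the missed minority class contributes an F1 of zero to the average. With 14 granular cell types, a few poorly recognized classes can lower the macro score in the same way.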

Problem

Research questions and friction points this paper is trying to address.

Benchmarking cell phenotyping performance of foundation models
Evaluating model generalization under domain shifts
Providing a challenging dataset for future FM assessments

Innovation

Methods, ideas, or system contributions that make the work stand out.

New H&E dataset with 14 cell types
Ready-to-use fine-tuning and benchmarking code
Evaluates FMs under domain shifts

Authors

Jerome Luescher
Charité Universitätsmedizin, Berlin, Germany; Helmholtz Imaging; Max-Delbrück Center, Berlin, Germany
Nora Koreuber
Charité Universitätsmedizin, Berlin, Germany; Helmholtz Imaging; Max-Delbrück Center, Berlin, Germany
Jannik Franzen
Charité Universitätsmedizin, Berlin, Germany; Universität Potsdam, Potsdam, Germany; Helmholtz Imaging; Max-Delbrück Center, Berlin, Germany
Fabian H. Reith
Charité Universitätsmedizin, Berlin, Germany; Humboldt-Universität zu Berlin, Berlin, Germany; Helmholtz Imaging; Max-Delbrück Center, Berlin, Germany
Claudia Winklmayr
Charité Universitätsmedizin, Berlin, Germany; Helmholtz Imaging; Max-Delbrück Center, Berlin, Germany
Christian M. Schuerch
Department of Pathology and Neuropathology, University Hospital and Comprehensive Cancer Center Tübingen, Tübingen, Germany; Cluster of Excellence iFIT (EXC 2180) "Image-Guided and Functionally Instructed Tumor Therapies", University of Tübingen, Germany
Dagmar Kainmueller
MDC Berlin
Josef Lorenz Rumberger
Humboldt-Universität zu Berlin, Berlin, Germany; Helmholtz Imaging; Max-Delbrück Center, Berlin, Germany