Benchmarking Deep Learning and Vision Foundation Models for Atypical vs. Normal Mitosis Classification with Cross-Dataset Evaluation

📅 2025-06-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Accurate identification of atypical mitoses in breast cancer histopathology remains challenging because of subtle morphological differences from normal mitoses, low prevalence, low inter-rater agreement among pathologists, and class imbalance in datasets. Method: Building on the AMi-Br dataset, we benchmark deep learning approaches for atypical mitotic figure (AMF) classification: baseline models, foundation models with linear probing, and foundation models fine-tuned with low-rank adaptation (LoRA). Two new hold-out datasets, AtNorM-Br (mitoses from the TCGA breast cancer cohort) and AtNorM-MD (a multi-domain dataset of mitoses from the MIDOG++ training set), are used to assess generalization across domains. Results: Average balanced accuracy reaches up to 0.8135 in-domain on AMi-Br, and 0.7696 and 0.7705 out-of-domain on AtNorM-Br and AtNorM-MD, respectively, with LoRA-based adaptation of the Virchow line of foundation models performing particularly well. All code and data are publicly released.

📝 Abstract

Atypical mitoses mark a deviation in the cell division process that can be an independent prognostically relevant marker for tumor malignancy. However, their identification remains challenging due to low prevalence, at times subtle morphological differences from normal mitoses, low inter-rater agreement among pathologists, and class imbalance in datasets. Building on the Atypical Mitosis dataset for Breast Cancer (AMi-Br), this study presents a comprehensive benchmark comparing deep learning approaches for automated atypical mitotic figure (AMF) classification, including baseline models, foundation models with linear probing, and foundation models fine-tuned with low-rank adaptation (LoRA). For rigorous evaluation, we further introduce two new hold-out AMF datasets: AtNorM-Br, a dataset of mitoses from the TCGA breast cancer cohort, and AtNorM-MD, a multi-domain dataset of mitoses from the MIDOG++ training set. We found average balanced accuracy values of up to 0.8135, 0.7696, and 0.7705 on the in-domain AMi-Br and the out-of-domain AtNorM-Br and AtNorM-MD datasets, respectively, with the results being particularly good for LoRA-based adaptation of the Virchow line of foundation models. Our work shows that atypical mitosis classification, while being a challenging problem, can be effectively addressed through the use of recent advances in transfer learning and model fine-tuning techniques. We make available all code and data used in this paper in this GitHub repository: https://github.com/DeepMicroscopy/AMi-Br_Benchmark.
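
The abstract reports balanced accuracy rather than plain accuracy, which matters when one class is rare. The following is a minimal sketch of the metric in plain Python, not the evaluation code from the paper's repository. The function names and the 90/10 toy label split are assumptions made for illustration only; they are not figures from the datasets.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, computed over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical toy labels: 0 = normal mitosis, 1 = atypical mitosis.
# The 90/10 split is invented for illustration, not taken from the paper.
y_true = [0] * 90 + [1] * 10

# A degenerate classifier that never predicts the rare class.
always_normal = [0] * 100

plain = accuracy(y_true, always_normal)
balanced = balanced_accuracy(y_true, always_normal)
print(f"accuracy={plain:.2f} balanced_accuracy={balanced:.2f}")
```

In this toy setting the degenerate classifier looks strong on plain accuracy but scores only chance level on balanced accuracy, because the recall on the rare class is zero. That is the usual motivation for reporting the balanced variant on imbalanced data.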

Problem

Research questions and friction points this paper is trying to address.

Classifying atypical vs. normal mitoses in breast cancer histopathology

Addressing low prevalence and class imbalance in datasets

Evaluating deep learning models for cross-dataset performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarking deep learning approaches for atypical mitosis classification

Using foundation models with LoRA fine-tuning

Cross-dataset evaluation with new AMF datasets
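
The LoRA fine-tuning listed above can be illustrated with a small dependency-free sketch of a single low-rank-adapted linear layer. This is a generic illustration of the technique, not the authors' implementation. The class name `LoRALinear`, the helper `matvec`, and the rank, scaling, and weight values are all hypothetical, and the source does not state which layers or hyperparameters the paper uses.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # pretrained weight, kept frozen
        self.rank = rank
        self.scale = alpha / rank     # standard LoRA scaling factor
        # A starts as small Gaussian noise and B as zeros, so the adapted
        # layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Only A and B would be updated during fine-tuning."""
        d_out, d_in = len(self.weight), len(self.weight[0])
        return self.rank * (d_in + d_out)


# Hypothetical 2x3 "pretrained" weight and input, for illustration only.
frozen_weight = [[1.0, 2.0, 3.0], [0.0, -1.0, 0.5]]
layer = LoRALinear(frozen_weight, rank=2)
x = [1.0, 1.0, 2.0]
print(layer.forward(x), layer.trainable_params())
```

Linear probing, by contrast, leaves the whole backbone frozen and trains only a classification head on top of its features. LoRA additionally lets the backbone's behavior shift through the small matrices `A` and `B` while the original weights stay untouched.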

🔎 Similar Papers

Evaluating deep learning models for breast cancer classification: a comparative study