Prostate biopsy whole slide image dataset from an underrepresented Middle Eastern population

📅 2025-12-03
🤖 AI Summary
Problem: Publicly available whole-slide image (WSI) datasets for histopathology are derived predominantly from Western populations. Regions such as the Middle East, where digital pathology infrastructure remains limited, are severely underrepresented, which leaves the cross-population generalizability of AI models largely untested. Method: We introduce a prostate needle biopsy WSI dataset from Erbil, Iraq, comprising 339 WSIs from 185 patients, scanned on Leica, Hamamatsu, and Grundium scanners and provided in native formats. All slides were graded independently by three pathologists for Gleason score and ISUP grade group, and were then de-identified. Contribution/Results: The dataset addresses a gap in digital prostate pathology data from the Middle East. It supports cross-scanner robustness evaluation, color normalization research, and inter-observer agreement analysis. The data are to be deposited in the BioImage Archive under a CC BY 4.0 license, supporting reproducible validation of AI models across diverse populations and heterogeneous scanning platforms.

📝 Abstract
Artificial intelligence (AI) is increasingly used in digital pathology. Publicly available histopathology datasets remain scarce, and those that do exist predominantly represent Western populations. Consequently, the generalizability of AI models to populations from less digitized regions, such as the Middle East, is largely unknown. This motivates the public release of our dataset to support the development and validation of pathology AI models across globally diverse populations. We present 339 whole-slide images of prostate core needle biopsies from a consecutive series of 185 patients collected in Erbil, Iraq. The slides are associated with Gleason scores and International Society of Urological Pathology (ISUP) grades assigned independently by three pathologists. Scanning was performed using two high-throughput scanners (Leica and Hamamatsu) and one compact scanner (Grundium). All slides were de-identified and are provided in their native formats without further conversion. The dataset enables grading concordance analyses, color normalization, and cross-scanner robustness evaluations. Data will be deposited in the BioImage Archive (BIA) under accession code: to be announced (TBA), and released under a CC BY 4.0 license.
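The abstract refers to both Gleason scores and ISUP grades. As a rough illustration of how the two relate, the sketch below maps a primary and secondary Gleason pattern to the corresponding ISUP grade group using the standard 2014 ISUP scheme. The function name `isup_grade_group` is hypothetical, and the sketch assumes only patterns 3 to 5 are reported on needle biopsies; it does not describe how labels are stored in the dataset itself and does not handle benign cores.

```python
def isup_grade_group(primary: int, secondary: int) -> int:
    """Map Gleason patterns (primary + secondary) to an ISUP grade group (1-5).

    Assumption: only patterns 3, 4, and 5 are valid inputs, as is usual
    for contemporary needle-biopsy reporting.
    """
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("Gleason patterns are expected to be 3, 4, or 5")

    score = primary + secondary
    if score == 6:
        return 1                          # 3+3
    if score == 7:
        return 2 if primary == 3 else 3   # 3+4 -> 2, 4+3 -> 3
    if score == 8:
        return 4                          # 4+4, 3+5, 5+3
    return 5                              # scores 9 and 10


# Example: a slide graded 4+3 falls in ISUP grade group 3.
example_group = isup_grade_group(4, 3)
```

Note that Gleason score 7 splits into two grade groups depending on which pattern is dominant, which is why the mapping takes the two patterns separately and not only their sum.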
Problem

Research questions and friction points this paper is trying to address.

Addresses scarcity of non-Western histopathology datasets for AI
Enables validation of AI models on Middle Eastern prostate biopsy images
Supports grading concordance and cross-scanner robustness analyses
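A grading concordance analysis of the kind listed above could, for instance, compare each pair of pathologists using Cohen's kappa with quadratic weights, a common choice for ordinal grades. The following is a minimal sketch under that assumption; the source does not state which agreement statistic the authors use. The function names and the rater labels in the usage note are hypothetical, and no real dataset values appear here.

```python
from itertools import combinations


def quadratic_weighted_kappa(a, b, categories):
    """Cohen's kappa with quadratic weights for two raters on ordinal categories."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Confusion matrix of rater a (rows) against rater b (columns).
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[index[x]][index[y]] += 1

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = ((i - j) ** 2) / ((k - 1) ** 2)   # quadratic disagreement weight
            num += w * observed[i][j]             # observed weighted disagreement
            den += w * row[i] * col[j] / n        # disagreement expected by chance
    if den == 0:
        return 1.0  # both raters used one identical category throughout
    return 1.0 - num / den


def pairwise_kappas(ratings, categories):
    """Return {(rater_1, rater_2): kappa} for every pair of raters."""
    return {
        (r1, r2): quadratic_weighted_kappa(ratings[r1], ratings[r2], categories)
        for r1, r2 in combinations(sorted(ratings), 2)
    }
```

With three raters, `pairwise_kappas({"A": [...], "B": [...], "C": [...]}, categories=[1, 2, 3, 4, 5])` would return three pairwise values, one per pair of pathologists, where each list holds one ISUP grade group per slide in the same slide order.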
Innovation

Methods, ideas, or system contributions that make the work stand out.

Publicly releasing Middle Eastern prostate biopsy dataset
Using multiple scanners for cross-scanner robustness evaluations
Providing native-format slides with independent pathologist grades
Authors
Peshawa J. Muhammad Ali
Department of Mechanical and Manufacturing Engineering, Koya University, Koya, Kurdistan Region, Iraq
Navin Vincent
Department of Medical Epidemiology and Biostatistics, SciLifeLab, Karolinska Institutet, Stockholm, Sweden
Saman S. Abdulla
College of Dentistry, Hawler Medical University, Erbil, Kurdistan Region, Iraq
Han N. Mohammed Fadhl
College of Dentistry, University of Sulaimani, Sulaymaniyah, Kurdistan Region, Iraq
A. Blilie
Department of Pathology, Stavanger University Hospital, Stavanger, Norway
K. Szolnoky
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
Julia Anna Mielcarz
Department of Medical Epidemiology and Biostatistics, SciLifeLab, Karolinska Institutet, Stockholm, Sweden
Xiaoyi Ji
Karolinska Institutet
K. Kartasalo
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
Abdulbasit K. Al-Talabani
Department of Software Engineering, Koya University, Koya, Kurdistan Region, Iraq
N. Mulliqi
Department of Medical Epidemiology and Biostatistics, SciLifeLab, Karolinska Institutet, Stockholm, Sweden