AI Summary
Accurate three-class (0/low/high) immunohistochemical (IHC) scoring of HER2 is critical for precise molecular subtyping and prognostication in breast cancer; however, conventional manual assessment suffers from high inter-observer variability and poor reproducibility. To address this, we introduce the first large-scale, multi-institutional Indian pathological breast cancer dataset, IPD-Breast, comprising 1,272 whole-slide images annotated for HER2, ER, and PR. We conduct the first systematic evaluation of low- versus high-resolution modeling for automated HER2 scoring and propose an end-to-end ConvNeXt-based whole-slide classification framework, eliminating reliance on computationally expensive patch-level processing and extensive fine-grained annotations. Our method achieves 91.79% AUC, 83.52% F1-score, and 83.56% accuracy on HER2 three-class classification, improving F1 by 5.35 percentage points over state-of-the-art patch-based approaches and demonstrating the efficacy and clinical translatability of low-resolution whole-slide modeling.
Abstract
Breast cancer, the most common malignancy among women, requires precise detection and classification for effective treatment. Immunohistochemistry (IHC) biomarkers like HER2, ER, and PR are critical for identifying breast cancer subtypes. However, traditional IHC classification relies on pathologists' expertise, making it labor-intensive and subject to significant inter-observer variability. To address these challenges, this study introduces the India Pathology Breast Cancer Dataset (IPD-Breast), comprising 1,272 IHC slides (HER2, ER, and PR) and aimed at automating receptor status classification. The primary focus is on developing predictive models for HER2 3-way classification (0, Low, High) to enhance prognosis. Evaluation of multiple deep learning models revealed that an end-to-end ConvNeXt network utilizing low-resolution IHC images achieved an AUC, F1, and accuracy of 91.79%, 83.52%, and 83.56%, respectively, for 3-way classification, outperforming patch-based methods by over 5.35% in F1 score. This study highlights the potential of simple yet effective deep learning techniques to significantly improve accuracy and reproducibility in breast cancer classification, supporting their integration into clinical workflows for better patient outcomes.
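To make the reported metrics concrete, the following is a minimal sketch of how accuracy and F1 could be computed for a 3-way HER2 task (0, Low, High). The abstract does not state how F1 is averaged across classes, so macro averaging is an assumption here; the function names and the toy labels are hypothetical and are not drawn from IPD-Breast.

```python
from collections import Counter

CLASSES = ("0", "Low", "High")  # HER2 3-way labels named in the abstract


def accuracy(y_true, y_pred):
    """Fraction of slides whose predicted class matches the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels invented for illustration only (not real study data).
y_true = ["0", "0", "Low", "Low", "High", "High"]
y_pred = ["0", "Low", "Low", "Low", "High", "0"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

With imbalanced classes, macro and weighted F1 can differ noticeably, so comparisons against patch-based baselines are only meaningful when the same averaging is used on both sides.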