A Two-Stage Strategy for Mitosis Detection Using Improved YOLO11x Proposals and ConvNeXt Classification

📅 2025-09-01
🤖 AI Summary
The MIDOG 2025 Track 1 task requires mitosis detection in whole-slide images (WSIs) that include non-tumor, inflamed, and necrotic regions. Complex tissue backgrounds and artifacts produce both false positives and false negatives, which limits the F1-score. To address this, we propose a two-stage detection framework. Stage 1 uses an improved YOLO11x, integrated with EMA attention and LSConv modules, to generate high-recall candidate bounding boxes under a low confidence threshold. Stage 2 filters these candidates with a ConvNeXt-Tiny classifier to remove false positives. Evaluated on a fused dataset comprising MIDOG++, MITOS_WSI_CCMCT, and MITOS_WSI_CMC, the method achieves an F1-score of 0.882 (+0.035 over the single-stage YOLO11x baseline) and a precision of 0.839 (+0.077), with comparable recall. The key contributions are (i) a two-stage detect-then-classify paradigm for mitosis detection in WSIs and (ii) structural enhancements to YOLO11x (EMA attention and LSConv) for generating candidates.

📝 Abstract
MIDOG 2025 Track 1 requires mitosis detection in whole-slide images (WSIs) containing non-tumor, inflamed, and necrotic regions. The complicated and heterogeneous context, together with possible artifacts, often causes false positives and false negatives, which degrades the detection F1-score. To address this problem, we propose a two-stage framework. First, an improved YOLO11x, integrated with EMA attention and LSConv, generates mitosis candidates. We use a low confidence threshold to generate as many proposals as possible, which keeps detection recall high. Then, a ConvNeXt-Tiny classifier filters out the false positives, which raises detection precision. As a result, the two-stage framework achieves a high detection F1-score. Evaluated on a fused dataset comprising MIDOG++, MITOS_WSI_CCMCT, and MITOS_WSI_CMC, our framework achieves an F1-score of 0.882, which is 0.035 higher than the single-stage YOLO11x baseline. The gain comes from a substantial precision improvement, from 0.762 to 0.839, with comparable recall. The code is available at https://github.com/xxiao0304/MIDOG-2025-Track-1-of-SZTU.
Problem

Research questions and friction points this paper is trying to address.

Detecting mitosis in whole-slide images with complex contexts
Reducing false positives and negatives in mitosis detection
Improving F1-score for mitosis detection in WSIs
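The F1-score named above is the harmonic mean of precision and recall, so it drops when either false positives or false negatives grow. The following is a minimal Python sketch of that relationship. The function names are hypothetical and the counts in the usage note are illustrative, not taken from the paper. `implied_recall` simply inverts the F1 formula; applying it to the rounded figures reported in the abstract gives only an approximate recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall, F1) from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_score(precision, recall)


def implied_recall(f1: float, precision: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given F1 and P.

    Inputs that are rounded (as published figures usually are)
    yield only an approximate recall.
    """
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall for this F1/precision pair")
    return f1 * precision / denom
```

For instance, with illustrative counts, `metrics_from_counts(90, 30, 10)` and `metrics_from_counts(90, 10, 10)` share the same recall, but the second has higher precision and therefore a higher F1. This is the effect a false-positive filter aims for.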
Innovation

Methods, ideas, or system contributions that make the work stand out.

Improved YOLO11x with EMA attention and LSConv
Two-stage framework with low-threshold proposals
ConvNeXt-Tiny classifier filters false positives
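The detect-then-classify flow listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: `Candidate`, `two_stage_filter`, and both threshold values are hypothetical, and the `classify` callable stands in for a ConvNeXt-Tiny model scoring the image patch around each box.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class Candidate:
    box: Box
    det_score: float  # stage-1 detector confidence


def two_stage_filter(
    candidates: List[Candidate],
    classify: Callable[[Box], float],
    det_threshold: float = 0.05,  # assumed value: low, to favor recall
    cls_threshold: float = 0.5,   # assumed value: classifier cut-off
) -> List[Candidate]:
    """Keep proposals that pass a low detector threshold and a classifier check."""
    # Stage 1: a permissive threshold keeps as many true mitoses as possible.
    proposals = [c for c in candidates if c.det_score >= det_threshold]
    # Stage 2: the classifier score removes likely false positives.
    return [c for c in proposals if classify(c.box) >= cls_threshold]
```

In a real pipeline the candidates would come from the detector's raw output and `classify` would crop a patch from the slide before running the classifier. Both steps are omitted here to keep the sketch self-contained.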
Jie Xiao
University of Science and Technology of China
low level vision, generative model, machine learning
Mengye Lyu
Shenzhen Technology University
MRI
Shaojun Liu
College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, 518118, China