🤖 AI Summary
To address the limited multi-scale feature extraction and dynamic fusion capabilities of Transformer-based models in chest X-ray pneumonia detection, this paper proposes a lightweight real-time detection framework. Methodologically, it introduces three key innovations: (1) a cross-gated fusion mechanism for adaptive inter-scale feature interaction; (2) novel modules—XFABlock, SPGA, and GCFC3—that jointly enhance multi-scale representation learning and efficient information aggregation; and (3) a synergistic integration of convolutional attention, Cross-Stage Partial (CSP) architecture, single-head self-attention, and structural re-parameterization across the backbone, neck, and detection head. Evaluated on the RSNA dataset, the model achieves mAP@0.5 = 82.2% (+3.7% over the baseline), mAP@[0.5:0.95] = 50.4%, and an inference speed of 48.1 FPS, striking a favorable balance between detection accuracy and real-time performance.
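The summary pairs single-head self-attention with a dynamic gating mechanism (the SPGA module). The exact formulation is not given here, so the following is only a rough sketch of that combination; all names, shapes, and the sigmoid gate are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_single_head_attention(x, wq, wk, wv, wg):
    """x: (tokens, dim). A single set of Q/K/V projections replaces
    multi-head attention; a learned gate modulates the attended output."""
    q, k, v = x @ wq, x @ wk, x @ wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))   # (tokens, tokens)
    out = attn @ v                                   # attended features
    gate = sigmoid(x @ wg)                           # per-token dynamic gate
    return gate * out

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal((5, d))
wq, wk, wv, wg = (rng.standard_normal((d, d)) for _ in range(4))
y = gated_single_head_attention(x, wq, wk, wv, wg)
assert y.shape == x.shape
```

Compared with multi-head attention, a single head with gating keeps one projection set per tensor, which is consistent with the paper's lightweight, real-time design goal.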
📝 Abstract
Pneumonia remains a leading cause of morbidity and mortality worldwide, necessitating accurate and efficient automated detection systems. While recent transformer-based detectors like RT-DETR have shown promise in object detection tasks, their application to medical imaging, particularly pneumonia detection in chest X-rays, remains underexplored. This paper presents CGF-DETR, an enhanced real-time detection transformer specifically designed for pneumonia detection. We introduce XFABlock in the backbone to improve multi-scale feature extraction through convolutional attention mechanisms integrated with the CSP architecture. To achieve efficient feature aggregation, we propose the SPGA module, which replaces standard multi-head attention with dynamic gating mechanisms and single-head self-attention. Additionally, we design GCFC3 for the neck to enhance feature representation through multi-path convolution fusion while maintaining real-time performance via structural re-parameterization. Extensive experiments on the RSNA Pneumonia Detection dataset demonstrate that CGF-DETR achieves 82.2% mAP@0.5, outperforming the baseline RT-DETR-l by 3.7% while maintaining a comparable inference speed of 48.1 FPS. Our ablation studies confirm that each proposed module contributes meaningfully to the overall performance improvement, with the complete model achieving 50.4% mAP@[0.5:0.95].
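The abstract's claim that GCFC3 keeps real-time performance "via structural re-parameterization" rests on a standard identity: parallel convolution branches of different kernel sizes, trained jointly, can be algebraically merged into a single kernel at inference time. A minimal 1-D numpy sketch of that identity (branch sizes and values are illustrative, not taken from the paper) looks like this:

```python
import numpy as np

def conv1d_same(x, k):
    """1-D cross-correlation with zero padding ('same' output length)."""
    pad = len(k) // 2
    xp = np.pad(x, pad)
    return np.array([np.dot(xp[i:i + len(k)], k) for i in range(len(x))])

# Two parallel branches, as in a multi-path convolution block
k3 = np.array([0.2, 0.5, -0.1])   # 3-tap branch
k1 = np.array([0.7])              # 1-tap branch

x = np.random.default_rng(1).standard_normal(16)
multi_branch = conv1d_same(x, k3) + conv1d_same(x, k1)

# Re-parameterize: embed the 1-tap kernel at the centre of a 3-tap kernel,
# collapsing both branches into a single convolution for inference
k_merged = k3 + np.array([0.0, k1[0], 0.0])
single_branch = conv1d_same(x, k_merged)

assert np.allclose(multi_branch, single_branch)
```

The merged model computes exactly the same function with one convolution per layer, which is why re-parameterized blocks can add training-time capacity without an inference-time FPS cost.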