🤖 AI Summary
Current promptable chest X-ray lesion detection methods rely on manually annotated prompts, which are labor-intensive to produce and impractical in clinical settings. To address this, we propose SP-Det, a self-prompted multi-label lesion detection framework that requires no expert annotations. Our method introduces a dual-text prompt generator (DTPG) that automatically produces two complementary kinds of textual prompt: semantic context prompts, which capture global pathological patterns, and disease beacon prompts, which focus on disease-specific manifestations. Additionally, we design a bidirectional feature enhancer (BFE) that integrates this diagnostic context with disease-specific embeddings to improve feature representation and detection accuracy. Evaluated on two chest X-ray datasets, our approach outperforms state-of-the-art detection methods, achieving absolute mAP improvements of 3.2–5.8 percentage points, while eliminating the dependence on expert-annotated prompts.
📝 Abstract
Automated lesion detection in chest X-rays has demonstrated significant potential for improving clinical diagnosis by precisely localizing pathological abnormalities. While recent promptable detection frameworks have achieved remarkable accuracy in target localization, existing methods typically rely on manual annotations as prompts, which are labor-intensive and impractical for clinical applications. To address this limitation, we propose SP-Det, a novel self-prompted detection framework that automatically generates rich textual context to guide multi-label lesion detection without requiring expert annotations. Specifically, we introduce an expert-free dual-text prompt generator (DTPG) that produces two complementary types of text prompt: semantic context prompts that capture global pathological patterns and disease beacon prompts that focus on disease-specific manifestations. Moreover, we devise a bidirectional feature enhancer (BFE) that integrates comprehensive diagnostic context with disease-specific embeddings to improve feature representation and detection accuracy. Extensive experiments on two chest X-ray datasets with diverse thoracic disease categories demonstrate that SP-Det outperforms state-of-the-art detection methods and, unlike existing promptable architectures, does not depend on expert-annotated prompts.
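The abstract does not specify how the BFE is built, so the following is only a minimal sketch of one generic way a two-way exchange between visual and text features could work: plain scaled dot-product cross-attention applied in both directions, with a residual connection. The function names (`cross_attend`, `bidirectional_enhance`), the absence of learned projections, and the toy inputs are all assumptions made for illustration and are not taken from SP-Det.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Add to each query an attention-weighted mix of keys_values.

    Plain scaled dot-product attention with no learned projections
    (an illustrative simplification, not the paper's design).
    """
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * kv[i] for w, kv in zip(weights, keys_values))
            for i in range(dim)
        ]
        # Residual connection: keep the original feature, add the context.
        out.append([qi + mi for qi, mi in zip(q, mixed)])
    return out


def bidirectional_enhance(visual, text):
    """Hypothetical two-way exchange: visual attends to text and vice versa."""
    visual_out = cross_attend(visual, text)
    text_out = cross_attend(text, visual)
    return visual_out, text_out


# Toy inputs: 3 image-region features and 2 prompt embeddings, 4-dimensional.
visual = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
text = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
visual_out, text_out = bidirectional_enhance(visual, text)
```

Both outputs keep the shape of their inputs, so in a sketch like this the enhanced visual features could be passed to a detection head unchanged. A real module would normally add learned query, key, and value projections and normalization layers.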