SP-Det: Self-Prompted Dual-Text Fusion for Generalized Multi-Label Lesion Detection

📅 2025-12-04
🤖 AI Summary
Current promptable lesion detection methods for chest X-rays typically rely on manual annotations as prompts, which are labor-intensive and impractical in clinical settings. SP-Det is a self-prompted framework for multi-label lesion detection that requires no expert annotations. It introduces a dual-text prompt generator (DTPG) that automatically produces two complementary kinds of text: semantic context prompts, which capture global pathological patterns, and disease beacon prompts, which focus on disease-specific manifestations. A bidirectional feature enhancer (BFE) then integrates the comprehensive diagnostic context with disease-specific embeddings to improve feature representation and detection accuracy. On two chest X-ray datasets covering diverse thoracic disease categories, SP-Det outperforms state-of-the-art detection methods while removing the dependence on expert-annotated prompts.

📝 Abstract
Automated lesion detection in chest X-rays has demonstrated significant potential for improving clinical diagnosis by precisely localizing pathological abnormalities. While recent promptable detection frameworks have achieved remarkable accuracy in target localization, existing methods typically rely on manual annotations as prompts, which are labor-intensive and impractical for clinical applications. To address this limitation, we propose SP-Det, a novel self-prompted detection framework that automatically generates rich textual context to guide multi-label lesion detection without requiring expert annotations. Specifically, we introduce an expert-free dual-text prompt generator (DTPG) that leverages two complementary textual modalities: semantic context prompts that capture global pathological patterns and disease beacon prompts that focus on disease-specific manifestations. Moreover, we devise a bidirectional feature enhancer (BFE) that synergistically integrates comprehensive diagnostic context with disease-specific embeddings to significantly improve feature representation and detection accuracy. Extensive experiments on two chest X-ray datasets with diverse thoracic disease categories demonstrate that SP-Det outperforms state-of-the-art detection methods while, unlike existing promptable architectures, completely eliminating the dependency on expert-annotated prompts.
Problem

Research questions and friction points this paper is trying to address.

Eliminates manual annotation dependency for lesion detection prompts
Automatically generates dual-text prompts for multi-label lesion localization
Improves detection accuracy without expert-annotated guidance requirements
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-prompted framework generates textual context automatically
Dual-text prompt generator uses semantic and disease-specific prompts
Bidirectional feature enhancer integrates diagnostic context for accuracy
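
To make the idea of bidirectional fusion more concrete, here is a minimal sketch of one generic way text and visual features can update each other: unprojected scaled dot-product cross-attention applied in both directions with residual connections. This is an illustrative assumption, not the BFE described in the paper, whose internals are not given on this page. The function names (`cross_attend`, `bidirectional_fuse`) and the toy embeddings are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Each query row becomes an attention-weighted average of keys_values rows.

    Assumption: no learned projections; a real module would use trained
    query/key/value weight matrices.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        attended.append(
            [sum(w * kv[d] for w, kv in zip(weights, keys_values)) for d in range(dim)]
        )
    return attended


def bidirectional_fuse(visual, text):
    """Residual update in both directions: visual attends to text, text to visual."""
    visual_ctx = cross_attend(visual, text)
    text_ctx = cross_attend(text, visual)
    new_visual = [[v + c for v, c in zip(v_row, c_row)]
                  for v_row, c_row in zip(visual, visual_ctx)]
    new_text = [[t + c for t, c in zip(t_row, c_row)]
                for t_row, c_row in zip(text, text_ctx)]
    return new_visual, new_text


# Toy inputs (hypothetical): three visual tokens and two prompt embeddings,
# standing in for a semantic context prompt and a disease beacon prompt.
visual_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_prompts = [[1.0, 0.0], [0.0, 1.0]]

fused_visual, fused_text = bidirectional_fuse(visual_tokens, text_prompts)
print(fused_visual)
print(fused_text)
```

Both outputs keep the shape of their inputs, so a fusion step like this could be stacked or placed ahead of a detection head. In practice the attention would use learned projections and multiple heads.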
Qing Xu
School of Computer Science, University of Nottingham Ningbo China, Ningbo, 31500, Zhejiang, China
Yanqian Wang
School of Computer Science, University of Nottingham Ningbo China, Ningbo, 31500, Zhejiang, China
Xiangjian He
University of Nottingham Ningbo China (2022.5--), University of Technology Sydney (1999.2-2022.5)
Computer Vision, Machine Learning, Data Analytics
Yue Li
School of Computer Science, University of Nottingham Ningbo China, Ningbo, 31500, Zhejiang, China
Yixuan Zhang
School of Computer Science, University of Nottingham Ningbo China, Ningbo, 31500, Zhejiang, China
Rong Qu
University of Nottingham
Hyper-heuristics, Vehicle Routing, Automated Algorithm Design, Combinatorial Optimisation
Wenting Duan
University of Lincoln
Computer Vision, Image Processing, Medical Imaging
Zhen Chen
Yale University, New Haven, CT 06510, USA