LesionDiffusion: Towards Text-controlled General Lesion Synthesis

📅 2025-03-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Medical lesion recognition in 3D CT relies heavily on large-scale pixel-level annotations, and existing lesion-synthesis methods scale poorly and offer little fine-grained control. To address these limitations, this paper introduces the first text-controllable 3D CT lesion synthesis framework. Our approach is a two-stage diffusion model (LMNet + LINet) driven by a structured lesion report template, integrating a text encoder, a 3D U-Net backbone, lesion-attribute conditional guidance, and cross-modal feature alignment. The framework enables fine-grained, organ- and lesion-type-agnostic controllable synthesis and jointly generates lesion images with matching segmentation masks. Evaluated on 14 lesion categories across eight anatomical organs, the method significantly improves downstream segmentation performance and shows zero-shot generalization to unseen lesion types and organs, outperforming state-of-the-art approaches.

📝 Abstract
Fully-supervised lesion recognition methods in medical imaging face challenges due to the reliance on large annotated datasets, which are expensive and difficult to collect. To address this, synthetic lesion generation has become a promising approach. However, existing models struggle with scalability, fine-grained control over lesion attributes, and the generation of complex structures. We propose LesionDiffusion, a text-controllable lesion synthesis framework for 3D CT imaging that generates both lesions and corresponding masks. By utilizing a structured lesion report template, our model provides greater control over lesion attributes and supports a wider variety of lesion types. We introduce a dataset of 1,505 annotated CT scans with paired lesion masks and structured reports, covering 14 lesion types across 8 organs. LesionDiffusion consists of two components: a lesion mask synthesis network (LMNet) and a lesion inpainting network (LINet), both guided by lesion attributes and image features. Extensive experiments demonstrate that LesionDiffusion significantly improves segmentation performance, with strong generalization to unseen lesion types and organs, outperforming current state-of-the-art models. Code will be available at https://github.com/HengruiTianSJTU/LesionDiffusion.
Problem

Research questions and friction points this paper is trying to address.

Overcomes reliance on large annotated medical datasets
Enables fine-grained control over lesion attributes
Generates complex lesion structures in 3D CT imaging
Innovation

Methods, ideas, or system contributions that make the work stand out.

Text-controllable lesion synthesis framework
Structured lesion report template usage
Lesion mask and inpainting networks integration
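The two-stage pipeline described in the abstract (LMNet synthesizes a lesion mask from the structured report, then LINet inpaints lesion texture into the CT volume under the same conditioning) can be sketched as follows. This is a minimal illustrative stand-in, not the paper's implementation: the names LMNet and LINet come from the paper, but every function body, signature, and intensity value below is a hypothetical placeholder; only the stage ordering and conditioning inputs follow the paper's description.

```python
import numpy as np

def encode_report(report: dict) -> np.ndarray:
    """Stand-in text encoder: map structured report fields to a fixed-size vector.
    (The paper uses a learned text encoder; this hash-seeded vector is a toy proxy.)"""
    seed = abs(hash(tuple(sorted(report.items())))) % 2**32
    return np.random.default_rng(seed).standard_normal(64)

def lmnet_sample(ct_volume: np.ndarray, cond: np.ndarray) -> np.ndarray:
    """Stage 1 (LMNet stand-in): produce a binary lesion mask for the volume.
    A real LMNet would run conditional diffusion; here we place a toy box region."""
    mask = np.zeros_like(ct_volume, dtype=bool)
    d, h, w = ct_volume.shape
    mask[d // 4: d // 2, h // 4: h // 2, w // 4: w // 2] = True
    return mask

def linet_inpaint(ct_volume: np.ndarray, mask: np.ndarray,
                  cond: np.ndarray) -> np.ndarray:
    """Stage 2 (LINet stand-in): inpaint lesion texture only inside the mask,
    leaving the rest of the volume untouched."""
    out = ct_volume.copy()
    out[mask] = ct_volume[mask] * 0.5 + 40.0  # toy intensity shift, not the model
    return out

def synthesize(ct_volume: np.ndarray, report: dict):
    """Full pipeline: report -> conditioning -> mask (stage 1) -> image (stage 2).
    Returns the paired synthetic image and its segmentation mask."""
    cond = encode_report(report)
    mask = lmnet_sample(ct_volume, cond)          # stage 1: mask synthesis
    image = linet_inpaint(ct_volume, mask, cond)  # stage 2: guided inpainting
    return image, mask

# Hypothetical usage on a blank 32^3 volume with a toy structured report.
volume = np.zeros((32, 32, 32), dtype=np.float32)
report = {"organ": "liver", "lesion_type": "cyst", "size": "small"}
image, mask = synthesize(volume, report)
```

The key property the sketch preserves is that the framework emits an image *and* its mask as a pair, which is what makes the synthetic data directly usable for training downstream segmentation models.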
Hengrui Tian
Shanghai Jiao Tong University
Wenhui Lei
University of Pennsylvania
AI4Health, Artificial Intelligence
Linrui Dai
Ph.D. @ the University of Tokyo
Medical Image Analysis, 3D Generation, 3D Reconstruction, Multimodal Learning
Hanyu Chen
The First Hospital of China Medical University
Xiaofan Zhang
Shanghai Artificial Intelligence Laboratory, University of Washington