Scaling Artificial Intelligence for Multi-Tumor Early Detection with More Reports, Fewer Masks

📅 2025-10-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Early multi-tumor detection in CT scans is hindered by reliance on labor-intensive voxel-level annotations and poor sensitivity to small lesions. Method: We propose R-Super, the first weakly supervised learning framework for tumor segmentation that requires only off-the-shelf clinical radiology reports, with no pixel-level masks. It leverages natural language processing to extract anatomical and pathological descriptions from reports, generating spatially grounded cross-modal alignment cues between text and image. Results: Trained on 101,654 real-world reports, R-Super matches the performance of a fully supervised baseline trained on 723 meticulously annotated cases. Combining reports with mask supervision improves sensitivity by 13% and specificity by 8%. The framework generalizes to six organs (spleen, gallbladder, prostate, bladder, uterus, and esophagus) for which no public segmentation masks or models previously existed. Across seven tumor types, it outperforms radiologists on five, substantially reducing annotation burden and enabling scalable, cost-effective multi-tumor early screening.

📝 Abstract
Early tumor detection saves lives. Each year, more than 300 million computed tomography (CT) scans are performed worldwide, offering a vast opportunity for effective cancer screening. However, detecting small or early-stage tumors on these CT scans remains challenging, even for experts. Artificial intelligence (AI) models can assist by highlighting suspicious regions, but training such models typically requires extensive tumor masks--detailed, voxel-wise outlines of tumors manually drawn by radiologists. Drawing these masks is costly, requiring years of effort and millions of dollars. In contrast, nearly every CT scan in clinical practice is already accompanied by a medical report describing the tumor's size, number, appearance, and sometimes, pathology results--information that is rich, abundant, and often underutilized for AI training. We introduce R-Super, which trains AI to segment tumors that match their descriptions in medical reports. This approach scales AI training with large collections of readily available medical reports, substantially reducing the need for manually drawn tumor masks. When trained on 101,654 reports, AI models achieved performance comparable to those trained on 723 masks. Combining reports and masks further improved sensitivity by +13% and specificity by +8%, surpassing radiologists in detecting five of the seven tumor types. Notably, R-Super enabled segmentation of tumors in the spleen, gallbladder, prostate, bladder, uterus, and esophagus, for which no public masks or AI models previously existed. This study challenges the long-held belief that large-scale, labor-intensive tumor mask creation is indispensable, establishing a scalable and accessible path toward early detection across diverse tumor types. We plan to release our trained models, code, and dataset at https://github.com/MrGiovanni/R-Super
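The abstract notes that reports already describe each tumor's size, number, and appearance. Purely as an illustrative sketch (the paper's actual report-parsing pipeline is not specified here, and this toy regex extractor is a hypothetical stand-in for it), pulling lesion measurements out of free-text report sentences might look like:

```python
import re

def parse_report(report: str) -> dict:
    """Toy extraction of lesion count and largest diameter (cm) from a report.

    Hypothetical illustration only: real pipelines use far more robust NLP,
    and counting one lesion per measurement is a simplifying assumption.
    """
    # Find measurements like "2.3 cm" or "9 mm" and normalize everything to cm.
    sizes_cm = []
    for value, unit in re.findall(r"(\d+(?:\.\d+)?)\s*(cm|mm)", report, re.I):
        v = float(value)
        sizes_cm.append(v / 10.0 if unit.lower() == "mm" else v)
    return {
        "num_tumors": len(sizes_cm),  # naive count: one lesion per measurement
        "largest_diameter_cm": max(sizes_cm) if sizes_cm else None,
    }

info = parse_report("Two hepatic lesions, the largest measuring 2.3 cm; "
                    "a second lesion measures 9 mm.")
print(info)  # {'num_tumors': 2, 'largest_diameter_cm': 2.3}
```

Structured attributes like these are what make reports usable as weak supervision: they constrain how many tumors the model should find and roughly how large each should be.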
Problem

Research questions and friction points this paper is trying to address.

Reducing reliance on costly manual tumor masks for AI training
Leveraging abundant medical reports to scale tumor detection AI
Enabling multi-tumor segmentation where no prior models existed
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses medical reports instead of manual tumor masks
Trains AI to segment tumors matching report descriptions
Enables scalable multi-tumor detection with fewer masks
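One plausible way to train a model to "segment tumors matching report descriptions" is to penalize disagreement between the soft tumor volume the network predicts and the volume implied by the diameter stated in the report. The sketch below is a hypothetical reconstruction under a spherical-lesion assumption, not R-Super's actual loss; `report_volume_loss` and its signature are inventions for illustration:

```python
import numpy as np

def report_volume_loss(pred_probs, diameter_cm, voxel_vol_cm3):
    """Sketch of a report-derived supervision signal (not the paper's exact
    loss): compare the soft tumor volume implied by per-voxel probabilities
    against the volume implied by the report's stated tumor diameter,
    assuming a roughly spherical lesion.
    """
    # Report-implied volume of a sphere with the stated diameter: (4/3)*pi*r^3.
    r = diameter_cm / 2.0
    target_vol = (4.0 / 3.0) * np.pi * r**3
    # Soft predicted volume: sum of probabilities times per-voxel volume.
    pred_vol = pred_probs.sum() * voxel_vol_cm3
    # Relative squared error keeps the loss scale-free across tumor sizes.
    return ((pred_vol - target_vol) / target_vol) ** 2

# A prediction whose soft volume matches the report-implied volume of a
# 2 cm lesion (with 1 mm^3 = 0.001 cm^3 voxels) incurs near-zero loss.
probs = np.zeros((32, 32, 32))
target = (4.0 / 3.0) * np.pi * 1.0**3      # 2 cm diameter -> ~4.19 cm^3
n = int(round(target / 0.001))             # number of fully "on" voxels
probs.flat[:n] = 1.0
loss = report_volume_loss(probs, diameter_cm=2.0, voxel_vol_cm3=0.001)
```

Because the target comes from text rather than a drawn mask, a loss of this kind is differentiable with respect to the segmentation output yet needs no voxel-level annotation, which is the general mechanism that lets report-scale data substitute for masks.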
P. Bassi
Johns Hopkins University, Baltimore, MD, USA.
Xinze Zhou
Johns Hopkins University, Baltimore, MD, USA.
Wenxuan Li
Johns Hopkins University
Imaging Informatics, Computer-aided Diagnosis
Szymon Płotka
Jagiellonian University
Machine Learning, Deep Learning, Computer Vision, Medical Imaging
Jieneng Chen
Johns Hopkins University
computer vision, world models, health, robotics
Qi Chen
Johns Hopkins University, Baltimore, MD, USA.
Zheren Zhu
University of California, Berkeley, CA, USA.
Ibrahim E. Hamaci
University of Zurich, Zurich, Switzerland.
Sezgin Er
Istanbul Medipol University, Istanbul, Turkey.
Yuhan Wang
University of California, Santa Cruz, CA, USA.
Ashwin Kumar
Washington University in St Louis
Reinforcement Learning, Resource Allocation, Fairness, Ride-sharing, Explainable AI Planning
Bjoern H. Menze
Istanbul Medipol University, Istanbul, Turkey.
Jarosław B. Ćwikła
University of Warmia and Mazury, Olsztyn, Poland.
Yuyin Zhou
Assistant Professor, Computer Science and Engineering, Genomics Institute, UC Santa Cruz
medical image analysis, machine learning, computer vision, AI in healthcare
Akshay Chaudhari
Stanford University, Stanford, CA, USA.
Curtis P. Langlotz
Professor of Radiology, Medicine, and Biomedical Data Science, Stanford University
machine learning, computer vision, natural language processing, decision support systems, technology assessment
Sergio Decherchi
Facility Coordinator, Fondazione Istituto Italiano di Tecnologia
machine learning, high performance computing, computational chemistry, applied math
Andrea Cavalli
Director, CECAM-EPFL; Professor, University of Bologna
Molecular Dynamics, Computational Chemistry, Drug Discovery, Cancer, Alzheimer's disease
Kang Wang
University of California, San Francisco, CA, USA.
Yang Yang
University of California, San Francisco, CA, USA.
Alan L. Yuille
Johns Hopkins University, Baltimore, MD, USA.
Zongwei Zhou
Johns Hopkins University, Baltimore, MD, USA.