🤖 AI Summary
Tumor annotation masks for medical image segmentation are scarce, which hinders model training. Method: This paper proposes a report-supervision loss (R-Super), presented as the first method to directly convert unstructured radiology reports into voxel-level weak supervision signals, trained jointly with the comparatively few precise masks available. Leveraging paired CT-report data, R-Super establishes a text-to-voxel semantic mapping via integrated NLP and medical image segmentation techniques, without handcrafted rules or intermediate labels. Contribution/Results: On internal and external validation sets, R-Super improves F1 score by up to 16% over baselines trained with masks only. The gains hold in both low-annotation (e.g., 50 masks) and high-annotation (e.g., 1.7K masks) regimes. This work introduces a scalable, end-to-end paradigm for exploiting abundant clinical text to alleviate the annotation bottleneck in medical image segmentation.
📝 Abstract
Tumor segmentation in CT scans is key for diagnosis, surgery, and prognosis, yet segmentation masks are scarce because their creation requires time and expertise. Public abdominal CT datasets contain from dozens to a couple thousand tumor masks, whereas hospitals hold hundreds of thousands of tumor CTs with radiology reports. Leveraging reports to improve segmentation is therefore key for scaling. In this paper, we propose a report-supervision loss (R-Super) that converts radiology reports into voxel-wise supervision for tumor segmentation AI. We created a dataset of 6,718 CT-Report pairs (from the UCSF Hospital) and merged it with public CT-Mask datasets (from AbdomenAtlas 2.0). Training with R-Super on these masks and reports strongly improved tumor segmentation in internal and external validation: F1 score increased by up to 16% relative to training with masks only. By leveraging readily available radiology reports to supplement scarce segmentation masks, R-Super strongly improves AI performance both when very few training masks are available (e.g., 50) and when many masks are available (e.g., 1.7K).
Project: https://github.com/MrGiovanni/R-Super
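The abstract does not spell out the form of the loss, so the following is only a minimal sketch of the general idea of mixing mask supervision with report-derived supervision. It assumes, hypothetically, that a report states a tumor diameter, and it swaps in a simple volume-consistency penalty (an aggregate constraint, not a voxel-wise one) in place of the actual R-Super formulation. All function names, the spherical-tumor approximation, and the `report_weight` parameter are illustrative assumptions and are not taken from the paper or its repository.

```python
import math


def dice_loss(probs, mask, eps=1e-6):
    """Soft Dice loss between per-voxel tumor probabilities and a binary mask."""
    intersection = sum(p * m for p, m in zip(probs, mask))
    denominator = sum(probs) + sum(mask)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def sphere_volume_mm3(diameter_mm):
    """Volume of a sphere with the given diameter (assumed tumor shape)."""
    return math.pi * diameter_mm ** 3 / 6.0


def report_volume_loss(probs, reported_diameter_mm, voxel_volume_mm3):
    """Relative gap between predicted tumor volume and the report-implied volume.

    Hypothetical stand-in for report supervision: the predicted volume is the
    sum of per-voxel probabilities times the volume of one voxel.
    """
    if reported_diameter_mm <= 0:
        raise ValueError("reported diameter must be positive")
    predicted = sum(probs) * voxel_volume_mm3
    target = sphere_volume_mm3(reported_diameter_mm)
    return abs(predicted - target) / target


def combined_loss(probs, mask=None, reported_diameter_mm=None,
                  voxel_volume_mm3=1.0, report_weight=0.5):
    """Use the mask term when a mask exists and the report term when a report exists."""
    loss = 0.0
    if mask is not None:
        loss += dice_loss(probs, mask)
    if reported_diameter_mm is not None:
        loss += report_weight * report_volume_loss(
            probs, reported_diameter_mm, voxel_volume_mm3
        )
    return loss
```

In this sketch a scan with only a report still contributes a training signal through `report_volume_loss`, which is the property the abstract emphasizes: report-only scans vastly outnumber scans with masks.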