A Multimodal Head and Neck Cancer Dataset for AI-Driven Precision Oncology

📅 2025-08-30
đŸ€– AI Summary
Problem: Publicly available, multicenter, multimodal head and neck cancer (HNC) datasets are lacking, which hinders AI development for tumor segmentation, recurrence-free survival (RFS) prediction, and HPV status classification. Method: We introduce HNC-1123, the first open-source, multicenter PET/CT dataset comprising 1,123 patients from 10 international centers, with expert-annotated tumor segmentations, radiotherapy dose maps for a subset of patients, and longitudinal clinical follow-up metadata. It represents the first standardized integration and high-fidelity annotation of heterogeneous multicenter PET/CT data. All imaging data are anonymized and provided in NIfTI format. We benchmark the dataset end to end with UNet, SegResNet, and multimodal prognostic models. Results: The benchmark models reach Dice > 0.82 for automatic segmentation, C-index = 0.74 for RFS prediction, and AUC = 0.89 for HPV classification, giving baselines for AI-driven precision radiotherapy and prognostic modeling in HNC.
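For readers unfamiliar with the segmentation metric quoted in the summary, the following is a minimal sketch of the Dice coefficient on two binary masks. It is an illustration only, not the paper's evaluation code: the function name `dice_coefficient`, the flat 0/1 list representation, and the convention that two empty masks score 1.0 are all assumptions made here.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks.

    Both masks are flat sequences of 0/1 voxel labels (an assumed,
    simplified representation of a flattened 3D volume).
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels marked as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total number of tumor voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 predicted voxels, 3 reference voxels, 2 shared.
predicted_mask = [0, 1, 1, 1, 0, 0]
reference_mask = [0, 0, 1, 1, 1, 0]
print(round(dice_coefficient(predicted_mask, reference_mask), 3))  # 0.667
```

A score of 1.0 means the predicted and reference masks overlap exactly, and 0.0 means they share no voxels.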

📝 Abstract
We describe a publicly available multimodal dataset of annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies for head and neck cancer research. The dataset includes 1,123 FDG-PET/CT studies from patients with histologically confirmed head and neck cancer, acquired from 10 international medical centers. All examinations consist of co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversity across institutions. Primary gross tumor volumes (GTVp) and involved lymph nodes (GTVn) were manually segmented by experienced radiation oncologists and radiologists following standardized guidelines and quality control measures. We provide anonymized NIfTI files of all studies, along with expert-annotated segmentation masks, radiotherapy dose distributions for a subset of patients, and comprehensive clinical metadata. This metadata includes TNM staging, HPV status, demographics (age and gender), long-term follow-up outcomes, survival times, censoring indicators, and treatment information. We demonstrate how this dataset can be used for three key clinical tasks: automated tumor segmentation, recurrence-free survival prediction, and HPV status classification, providing benchmark results using state-of-the-art deep learning models, including UNet, SegResNet, and multimodal prognostic frameworks.
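The survival times and censoring indicators listed in the metadata are the inputs to a concordance index (C-index), the metric reported for recurrence-free survival prediction. The sketch below shows one common formulation (Harrell's C-index) in plain Python. It is not the authors' implementation: the function name, the encoding of the event flag (1 = recurrence observed, 0 = censored), and the choice to skip pairs with tied times are assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if recurrence was observed, 0 if censored (assumed encoding)
    risks  : predicted risk score per patient (higher = earlier recurrence)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot be the earlier member of a pair,
            # because the true event time is unknown.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event has higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count half
    if comparable == 0:
        raise ValueError("no comparable patient pairs")
    return concordant / comparable


# Toy cohort of four patients (values are made up for illustration).
follow_up = [5, 8, 12, 20]
observed = [1, 0, 1, 0]
predicted_risk = [0.9, 0.3, 0.2, 0.6]
print(concordance_index(follow_up, observed, predicted_risk))  # 0.75
```

In the toy cohort, four pairs are comparable and three are ranked correctly, giving 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.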

Problem

Research questions and friction points this paper is trying to address.

Creating a multimodal PET/CT dataset for head and neck cancer research
Providing expert-annotated tumor segmentations and clinical metadata
Enabling AI applications for segmentation and survival prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Public multimodal PET/CT cancer dataset
Expert-annotated tumor segmentation masks
Deep learning models for clinical tasks
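
One of the clinical tasks, HPV status classification, is evaluated with the area under the ROC curve (AUC). The following sketch computes AUC through its pairwise-ranking interpretation: the fraction of (positive, negative) pairs in which the positive case receives the higher score. The function name `roc_auc`, the label encoding (1 = HPV-positive, 0 = HPV-negative), and the toy scores are assumptions for illustration, not values or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels : 1 for HPV-positive, 0 for HPV-negative (assumed encoding)
    scores : predicted probability or score of being HPV-positive
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # positive correctly ranked above negative
            elif p == q:
                wins += 0.5   # ties count half
    return wins / (len(positives) * len(negatives))


# Toy example with three positive and two negative cases.
hpv_labels = [1, 0, 1, 0, 1]
hpv_scores = [0.8, 0.4, 0.35, 0.2, 0.7]
print(round(roc_auc(hpv_labels, hpv_scores), 3))  # 0.833
```

The pairwise loop is quadratic in the number of cases, which is fine for a small illustration. A rank-based formulation would be the usual choice for larger cohorts.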

Authors

Numan Saeed, Mohamed Bin Zayed University of Artificial Intelligence (AI for Medicine, Medical Imaging, Machine Learning)
Salma Hassan, Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Shahad Hardan, PhD in Machine Learning (AI in Healthcare)
Ahmed Aly, MBZUAI (Computer Vision)
Darya Taratynova, Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Umair Nawaz, Department of Computer Vision, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Ufaq Khan, Department of Computer Vision, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Muhammad Ridzuan, Mohamed bin Zayed University of Artificial Intelligence (AI for Healthcare, Machine Learning, Deep Learning, Computer Vision, Geology)
Thomas Eugene, Nantes Université, CHU Nantes, Nuclear Medicine Department, Nantes, France
Raphaël Metz, Nantes Université, CHU Nantes, Nuclear Medicine Department, Nantes, France
Mélanie Dore, Radiation Oncology Department, Institut de Cancérologie de l'Ouest, Saint-Herblain, France
Gregory Delpon, Medical Physics Department, Institut de Cancérologie de l'Ouest, Saint-Herblain, France
Vijay Ram Kumar Papineni, Radiology Department, Sheikh Shakhbout Medical City, Abu Dhabi, UAE
Kareem Wahid, MD Anderson Cancer Center, The University of Texas, Texas, United States
Cem Dede, MD Anderson Cancer Center, The University of Texas, Texas, United States
Alaa Mohamed Shawky Ali, MD Anderson Cancer Center, The University of Texas, Texas, United States
Carlos Sjogreen, MD Anderson Cancer Center, The University of Texas, Texas, United States
Mohamed Naser, MD Anderson Cancer Center, The University of Texas, Texas, United States
Clifton D. Fuller, MD Anderson Cancer Center, The University of Texas, Texas, United States
Valentin Oreiller, Institute of Informatics, HES-SO Valais-Wallis University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland
Mario Jreige, Department of Nuclear Medicine and Molecular Imaging, Lausanne University Hospital (CHUV), Rue du Bugnon 46, CH-1011 Lausanne, Switzerland
John O. Prior, Department of Nuclear Medicine and Molecular Imaging, Lausanne University Hospital (CHUV), Rue du Bugnon 46, CH-1011 Lausanne, Switzerland
Catherine Cheze Le Rest, Centre Hospitalier Universitaire de Poitiers (CHUP), Poitiers, France
Olena Tankyevych, Centre Hospitalier Universitaire de Poitiers (CHUP), Poitiers, France
Pierre Decazes, Center Henri Becquerel, LITIS laboratory, University of Rouen Normandy, Rouen, France