A comprehensive multimodal dataset and benchmark for ulcerative colitis scoring in endoscopy

📅 2026-03-15
🤖 AI Summary
This study addresses a gap in research on endoscopic assessment of ulcerative colitis: the lack of publicly available, multicenter, multimodal datasets with expert annotations and reliable benchmarks. The authors present what is, to their knowledge, the first multicenter, multi-resolution endoscopic image dataset that combines Mayo Endoscopic Score (MES) and Ulcerative Colitis Endoscopic Index of Severity (UCEIS) labels with expert-generated clinical descriptions. They benchmark convolutional neural networks, vision transformers, hybrid architectures, and widely used vision-language models on endoscopic score classification and image captioning. The resulting benchmark supports both automated severity scoring and the generation of interpretable descriptions, with the aim of improving algorithmic robustness and clinical interpretability in inflammatory bowel disease assessment.

📝 Abstract
Ulcerative colitis (UC) is a chronic mucosal inflammatory condition that places patients at increased risk of colorectal cancer. Colonoscopic surveillance remains the gold standard for assessing disease activity, and reporting typically relies on standardised endoscopic scoring metrics. The most widely used is the Mayo Endoscopic Score (MES), with some centres also adopting the Ulcerative Colitis Endoscopic Index of Severity (UCEIS). Both are descriptive assessments of mucosal inflammation (MES: 0 to 3; UCEIS: 0 to 8), where higher values indicate more severe disease. However, computational methods for automatically predicting these scores remain limited, largely due to the lack of publicly available expert-annotated datasets and the absence of robust benchmarking. There is also a significant research gap in generating clinically meaningful descriptions of UC images, despite image captioning being a well-established computer vision task. Variability in endoscopic systems and procedural workflows across centres further highlights the need for multi-centre datasets to ensure algorithmic robustness and generalisability. In this work, we introduce a curated multi-centre, multi-resolution dataset that includes expert-validated MES and UCEIS labels, alongside detailed clinical descriptions. To our knowledge, this is the first comprehensive dataset that combines dual scoring metrics for classification tasks with expert-generated captions describing mucosal appearance and clinically accepted reasoning for image captioning. This resource opens new opportunities for developing clinically meaningful multimodal algorithms. In addition to the dataset, we also provide benchmarking using convolutional neural networks, vision transformers, hybrid models, and widely used multimodal vision-language captioning algorithms.
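As a rough illustration of what one annotated sample might look like, the sketch below models a record that carries both scores and a caption, and checks the score ranges stated in the abstract (MES: 0 to 3; UCEIS: 0 to 8). The class name, field names, file path, and example caption are all hypothetical; the actual file layout of the dataset is not described here.

```python
from dataclasses import dataclass

# Score ranges as stated in the abstract.
MES_MIN, MES_MAX = 0, 3
UCEIS_MIN, UCEIS_MAX = 0, 8


@dataclass(frozen=True)
class UCRecord:
    """Hypothetical container for one annotated endoscopic image."""

    image_path: str  # assumed field: location of the image file
    centre: str      # assumed field: contributing centre
    mes: int         # Mayo Endoscopic Score, 0 to 3
    uceis: int       # UCEIS, 0 to 8
    caption: str     # expert-written clinical description

    def __post_init__(self) -> None:
        # Reject labels that fall outside the documented ranges.
        if not MES_MIN <= self.mes <= MES_MAX:
            raise ValueError(f"MES must be in [{MES_MIN}, {MES_MAX}], got {self.mes}")
        if not UCEIS_MIN <= self.uceis <= UCEIS_MAX:
            raise ValueError(f"UCEIS must be in [{UCEIS_MIN}, {UCEIS_MAX}], got {self.uceis}")


# Illustrative values only; not taken from the dataset.
example = UCRecord(
    image_path="images/sample_0001.png",
    centre="centre_A",
    mes=2,
    uceis=5,
    caption="Marked erythema with an obliterated vascular pattern.",
)
```

Validating at construction time keeps out-of-range labels from reaching a classifier or captioning model, whichever of the two tasks the record is used for.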
Problem

Research questions and friction points this paper is trying to address.

ulcerative colitis
endoscopic scoring
multimodal dataset
image captioning
benchmark
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal dataset
ulcerative colitis scoring
endoscopic image captioning
multi-center benchmark
vision-language models
Noha Ghatwary
Arab Academy for Science and Technology
Image Processing, Medical Image Processing, Artificial Intelligence, Video Analysis

Jiangbei Yue
University of Leeds
Computer Vision, Computer Graphics, Differentiable Physics

Ahmed Elgendy
Department of Electrical and Computer Engineering, Queen’s University, Canada

Hanna Nagdy
Internal Medicine Department, College of Medicine, Arab Academy for Science and Technology, Egypt

Ahmed Galal
Internal Medicine, Alexandria University, Egypt

Hayam Fathy
Internal Medicine, Division of Hepatogastroenterology, Assiut University, Egypt

Hussein El-Amin
Internal Medicine, Division of Hepatogastroenterology, Assiut University, Egypt

Venkataraman Subramanian
Leeds Teaching Hospital NHS Trust, Leeds, United Kingdom

Noor Mohammed
PhD in Electrical and Computer Engineering, University of Massachusetts Amherst
Wireless Power Transfer, Wearable Sensors, RFID, Computational Modeling, RF Circuit

Gilberto Ochoa-Ruiz
Tec de Monterrey, CV-inside lab, Advanced AI Research Group
Endoscopic Imaging, Computer Vision, Medical Image Computing, Image-guided Surgery, XAI

Sharib Ali
University of Leeds, School of Computer Science
Medical Image Analysis, Cancer diagnosis, Surgical data science, Image-Guided Surgery, vision