🤖 AI Summary
This study addresses the automatic recognition of multiple abnormalities (e.g., bleeding, ulcers, polyps) in small-bowel video capsule endoscopy. We establish the first fine-grained, multi-abnormality classification benchmark for real-world clinical videos, featuring a standardized evaluation protocol and a publicly available, expert-annotated dataset. Methodologically, we propose an end-to-end weakly supervised temporal modeling framework that jointly performs frame-level feature extraction, video segment aggregation, and multi-label classification, balancing generalizability and interpretability. The associated challenge attracted 37 international teams; the top-performing solution achieved a mean accuracy of 89.6%, outperforming the baseline by 12.3%. This work provides the first empirical validation of multi-abnormality co-modeling in capsule video analysis. It delivers a reproducible, rigorously evaluated technical paradigm and a foundational data resource to advance AI-driven early screening for gastrointestinal disorders.
📝 Abstract
We present the Capsule Vision 2024 Challenge: Multi-Class Abnormality Classification for Video Capsule Endoscopy. The challenge was organized virtually by the Research Center for Medical Image Analysis and Artificial Intelligence (MIAAI), Department of Medicine, Danube Private University, Krems, Austria, in collaboration with the 9th International Conference on Computer Vision & Image Processing (CVIP 2024), organized by the Indian Institute of Information Technology, Design and Manufacturing (IIITDM) Kancheepuram, Chennai, India. This document provides an overview of the challenge, including the registration process, rules, submission format, a description of the datasets used, the rankings of qualified teams, descriptions of all teams, and the benchmarking results reported by the organizers.