🤖 AI Summary
This study addresses the challenge of identifying unique polyps in colon capsule endoscopy (CCE), where large image volumes and ambiguous frame-level annotations hinder reliable recognition. The authors propose a multiple instance learning–based verification framework that compares a query image against a bag of images. The method combines a ConvNeXt backbone with Variance-Excited Multi-head Attention (VEMA) and Distance-Based Attention (DBA), and uses SimCLR self-supervised pretraining to improve feature discriminability. This work represents the first application of a multiple instance verification paradigm to unique polyp identification in CCE. On a clinical dataset of 754 patients and 1,912 polyps, the DBA L1 variant achieves the highest test accuracy, 86.26%, with a test AUC of 0.928.
📝 Abstract
Identifying unique polyps in colon capsule endoscopy (CCE) images is a critical yet challenging task for medical personnel due to the large volume of images, the resulting cognitive load on clinicians, and the ambiguity in labeling specific frames. This paper formulates the problem as a multi-instance learning (MIL) task, in which a query polyp image is compared with a target bag of images to determine uniqueness. We employ a multi-instance verification (MIV) framework that incorporates attention mechanisms, such as variance-excited multi-head attention (VEMA) and distance-based attention (DBA), to enhance the model's ability to extract meaningful representations. Additionally, we investigate the impact of self-supervised learning with SimCLR for generating robust embeddings. Experimental results on a dataset of 1,912 polyps from 754 patients demonstrate that attention mechanisms significantly improve performance, with DBA L1 achieving the highest test accuracy of 86.26% and a test AUC of 0.928 using a ConvNeXt backbone with SimCLR pretraining. This study underscores the potential of MIL and self-supervised learning in advancing automated analysis of CCE images, with implications for broader medical imaging applications.
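To make the query-versus-bag comparison concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify how DBA is computed, so the sketch assumes one plausible reading: each bag instance is weighted by a softmax over its negative L1 distance to the query embedding, and the weighted bag summary is then compared with the query. The function names (`distance_attention_pool`, `verify`), the `temperature` parameter, and the cosine-similarity threshold are all hypothetical, and the embeddings are assumed to come from a pretrained backbone such as ConvNeXt.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def distance_attention_pool(query, bag, temperature=1.0):
    """Pool a bag of embeddings with attention driven by L1 distance.

    query: array of shape (d,), embedding of the query polyp image.
    bag:   array of shape (n, d), embeddings of the target bag.
    Returns (weights, pooled), where weights has shape (n,) and sums to 1.
    Assumption: closer instances (smaller L1 distance) get larger weights.
    """
    dists = np.abs(bag - query).sum(axis=1)   # L1 distance per instance
    weights = softmax(-dists / temperature)   # nearer -> higher attention
    pooled = weights @ bag                    # weighted bag representation
    return weights, pooled


def verify(query, bag, threshold=0.5):
    """Score whether the query matches the bag (hypothetical decision rule)."""
    _, pooled = distance_attention_pool(query, bag)
    denom = np.linalg.norm(query) * np.linalg.norm(pooled) + 1e-12
    score = float(query @ pooled / denom)     # cosine similarity
    return score, score >= threshold


# Toy usage with 2-D embeddings standing in for backbone features.
query = np.array([1.0, 0.0])
bag = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
weights, pooled = distance_attention_pool(query, bag)
score, is_match = verify(query, bag)
```

In this toy example the first bag instance coincides with the query, so it receives the largest attention weight. In a trained system the decision would more likely come from a learned classifier head than from a fixed cosine threshold; the threshold is used here only to keep the sketch self-contained.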