CLIP Based Region-Aware Feature Fusion for Automated BBPS Scoring in Colonoscopy Images

📅 2025-12-23
🤖 AI Summary
To address the subjectivity and substantial inter-observer variability of manual Boston Bowel Preparation Scale (BBPS) assessment during colonoscopy, this paper proposes an end-to-end automatic BBPS scoring method that requires no explicit fecal segmentation. The method fine-tunes the CLIP model with adapters to jointly encode global visual features and stool-related textual priors, and adds a lightweight region-aware fecal feature branch so that the text priors can guide multimodal feature fusion. This design avoids the reliance of conventional approaches on handcrafted features or pixel-level segmentation. Evaluated on the authors' dataset of 2,240 colonoscopy images from 517 subjects and on the public NERTHU dataset, the method outperforms existing baselines, indicating potential for clinical deployment in computer-aided colonoscopy analysis.

📝 Abstract
Accurate assessment of bowel cleanliness is essential for effective colonoscopy procedures. The Boston Bowel Preparation Scale (BBPS) offers a standardized scoring system but suffers from subjectivity and inter-observer variability when performed manually. In this paper, to support robust training and evaluation, we construct a high-quality colonoscopy dataset comprising 2,240 images from 517 subjects, annotated with expert-agreed BBPS scores. We propose a novel automated BBPS scoring framework that leverages the CLIP model with adapter-based transfer learning and a dedicated fecal-feature extraction branch. Our method fuses global visual features with stool-related textual priors to improve the accuracy of bowel cleanliness evaluation without requiring explicit segmentation. Extensive experiments on both our dataset and the public NERTHU dataset demonstrate the superiority of our approach over existing baselines, highlighting its potential for clinical deployment in computer-aided colonoscopy analysis.

Problem

Research questions and friction points this paper is trying to address.

Automates Boston Bowel Preparation Scale scoring to reduce subjectivity
Fuses global visual and stool-related textual features for cleanliness assessment
Eliminates need for explicit segmentation in colonoscopy image analysis
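
One common way to quantify the inter-observer variability mentioned above is a weighted agreement statistic between two raters' scores. The sketch below computes quadratic weighted kappa for per-image scores on a 0 to 3 scale. It is an illustrative assumption rather than the paper's stated evaluation metric, and the function name `quadratic_weighted_kappa` and the toy ratings are hypothetical.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes=4):
    """Quadratic weighted kappa between two lists of integer scores in [0, n_classes)."""
    n = len(rater_a)
    # Confusion matrix of observed score pairs.
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for i, j in zip(rater_a, rater_b):
        observed[i][j] += 1
    # Marginal score histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            num += weight * observed[i][j]
            den += weight * hist_a[i] * hist_b[j] / n
    # If both raters used a single identical class, treat agreement as perfect.
    return 1.0 - num / den if den > 0 else 1.0


# Toy ratings (made up for illustration only).
perfect = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3])
opposite = quadratic_weighted_kappa([0, 0, 3, 3], [3, 3, 0, 0])
```

A value near 1 indicates close agreement, while values near or below 0 indicate agreement no better than chance. The same function could compare a model's predictions against an expert's scores.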

Innovation

Methods, ideas, or system contributions that make the work stand out.

CLIP model with adapter-based transfer learning
Fusion of global visual and stool-related textual features
Automated BBPS scoring without explicit segmentation
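
The fusion idea listed above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' architecture: the residual bottleneck adapter, the concatenation-based fusion, the four-class output (scores 0 to 3), and every name, dimension, and random weight below are hypothetical stand-ins for a real CLIP encoder and trained parameters.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def relu(vec):
    return [max(0.0, x) for x in vec]


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def adapter(feat, w_down, w_up, alpha=0.2):
    """Bottleneck adapter with a residual blend: (1 - alpha) * feat + alpha * MLP(feat)."""
    adapted = relu(matvec(w_up, relu(matvec(w_down, feat))))
    return [(1 - alpha) * f + alpha * a for f, a in zip(feat, adapted)]


def text_similarities(visual, text_embeds):
    """Cosine similarity between the visual feature and each text-prior embedding."""
    v = l2_normalize(visual)
    return [sum(a * b for a, b in zip(v, l2_normalize(t))) for t in text_embeds]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_score(visual, region, text_embeds, w_down, w_up, w_cls, b_cls):
    """Fuse adapted global, region, and text-similarity features, then classify."""
    v = adapter(visual, w_down, w_up)
    sims = text_similarities(v, text_embeds)
    fused = v + region + sims  # simple concatenation (an assumption)
    logits = [z + b for z, b in zip(matvec(w_cls, fused), b_cls)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


# Toy dimensions and random weights standing in for encoder outputs and trained layers.
rng = random.Random(0)
rand_vec = lambda n: [rng.uniform(-1, 1) for _ in range(n)]
rand_mat = lambda rows, cols: [rand_vec(cols) for _ in range(rows)]

visual = rand_vec(4)                            # global image feature
region = rand_vec(2)                            # region-aware fecal feature
text_embeds = [rand_vec(4) for _ in range(4)]   # one stool-related prompt per score
score, probs = predict_score(
    visual, region, text_embeds,
    w_down=rand_mat(2, 4), w_up=rand_mat(4, 2),
    w_cls=rand_mat(4, 10), b_cls=[0.0] * 4,
)
```

Because the classifier sees only feature vectors and prompt similarities, no segmentation mask is produced or required at any step, which is the property the list above highlights. In a real system the random vectors would be replaced by CLIP image and text embeddings, and only the adapter, region branch, and classifier would typically be trained.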