MEGAN: Mixture of Experts for Robust Uncertainty Estimation in Endoscopy Videos

📅 2025-09-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Inter-annotator variability is often overlooked in medical AI, which undermines the reliability of uncertainty quantification (UQ). To address this, the paper proposes MEGAN, a Multi-Expert Gating Network built on Evidential Deep Learning (EDL), for assessing ulcerative colitis (UC) severity in endoscopy videos. The framework trains multiple EDL expert models, each with a different ground truth and modeling strategy, and fuses their predictions and uncertainties through a learnable gating network, explicitly modeling rater heterogeneity. This improves predictive calibration and enables uncertainty-guided sample stratification that reduces annotation burden. On data from a large-scale prospective clinical trial scored with the Mayo Endoscopic Subscore (MES), MEGAN achieves a 3.5% improvement in F1-score and a 30.5% reduction in Expected Calibration Error (ECE) compared to existing methods, and could potentially increase efficiency and consistency in UC trials.
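As a rough illustration of how an EDL model can report uncertainty alongside its prediction, the sketch below uses the commonly used Dirichlet formulation of evidential classification: non-negative per-class evidence is turned into Dirichlet parameters, whose total strength determines the uncertainty. This is an assumption about the general technique, not the paper's exact implementation; the function name `edl_outputs` is hypothetical.

```python
import numpy as np

def edl_outputs(evidence):
    """Turn non-negative per-class evidence into class probabilities and an
    uncertainty score, following the common Dirichlet-based EDL formulation.
    (Illustrative assumption: the paper may use a different variant.)"""
    evidence = np.asarray(evidence, dtype=float)
    alpha = evidence + 1.0                          # Dirichlet parameters
    strength = alpha.sum(axis=-1, keepdims=True)    # total Dirichlet strength S
    probs = alpha / strength                        # expected class probabilities
    num_classes = evidence.shape[-1]
    uncertainty = num_classes / strength.squeeze(-1)  # u = K / S, in (0, 1]
    return probs, uncertainty

# Four classes, matching the four MES grades (0-3).
probs, unc = edl_outputs([[36.0, 0.0, 0.0, 0.0],   # strong evidence for one class
                          [0.0, 0.0, 0.0, 0.0]])   # no evidence at all
```

With no evidence the probabilities are uniform and the uncertainty is 1; as evidence accumulates for a class, the uncertainty shrinks toward 0. A single forward pass yields both quantities, which is the efficiency argument made for EDL.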

📝 Abstract

Reliable uncertainty quantification (UQ) is essential in medical AI. Evidential Deep Learning (EDL) offers a computationally efficient way to quantify model uncertainty alongside predictions, in contrast to traditional methods such as Monte Carlo (MC) Dropout and Deep Ensembles (DE). However, all of these methods typically rely on a single expert's annotations as ground truth for model training, overlooking the inter-rater variability common in healthcare. To address this issue, we propose MEGAN, a Multi-Expert Gating Network that aggregates uncertainty estimates and predictions from multiple AI experts via EDL models trained with diverse ground truths and modeling strategies. MEGAN's gating network optimally combines predictions and uncertainties from each EDL model, enhancing overall prediction confidence and calibration. We extensively benchmark MEGAN on endoscopy videos for ulcerative colitis (UC) disease severity estimation, assessed by visual labeling of the Mayo Endoscopic Subscore (MES), where inter-rater variability is prevalent. In a large-scale prospective UC clinical trial, MEGAN achieved a 3.5% improvement in F1-score and a 30.5% reduction in Expected Calibration Error (ECE) compared to existing methods. Furthermore, MEGAN facilitated uncertainty-guided sample stratification, reducing the annotation burden and potentially increasing efficiency and consistency in UC trials.
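One simple way to picture uncertainty-guided sample stratification is a threshold rule: low-uncertainty predictions are accepted automatically and high-uncertainty ones are routed to a human reader. The sketch below is only an illustration of that idea; the abstract does not describe the stratification rule, and both the function name `stratify_by_uncertainty` and the threshold value are hypothetical.

```python
import numpy as np

def stratify_by_uncertainty(uncertainty, threshold=0.5):
    """Split sample indices into an auto-accept group and a needs-review group.
    The threshold is a hypothetical tuning parameter, not a value from the paper."""
    uncertainty = np.asarray(uncertainty, dtype=float)
    auto_accept = np.flatnonzero(uncertainty <= threshold)   # confident predictions
    needs_review = np.flatnonzero(uncertainty > threshold)   # send to a human rater
    return auto_accept, needs_review

# Per-video uncertainty scores (made-up values for illustration only).
auto_idx, review_idx = stratify_by_uncertainty([0.1, 0.7, 0.5, 0.9])
```

Only the videos in `review_idx` would need manual scoring under this rule, which is how a well-calibrated uncertainty score can translate into reduced annotation burden.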

Problem

Research questions and friction points this paper is trying to address.

Addressing inter-rater variability in medical AI uncertainty estimation

Aggregating predictions from multiple experts with diverse annotations

Improving uncertainty calibration for endoscopic disease severity assessment
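For readers unfamiliar with the calibration metric reported here, the following is a minimal sketch of Expected Calibration Error using the standard equal-width binning definition: samples are grouped by confidence, and the gap between accuracy and mean confidence in each bin is averaged, weighted by bin size. The bin count of 10 is an assumption; the source does not state the binning used.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: weighted mean of |accuracy - confidence| per bin.
    n_bins=10 is an assumed default, not a value taken from the paper."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only: bin b holds confidences in (edges[b], edges[b + 1]].
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap   # mask.mean() is the fraction of samples in the bin
    return ece

# Confidence of the predicted class and whether each prediction was right.
ece = expected_calibration_error([0.55, 0.55, 0.95, 0.95], [1, 0, 1, 1])
```

A perfectly calibrated model has an ECE of 0, so a lower value means the reported confidence tracks the actual accuracy more closely.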

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Expert Gating Network (MEGAN) learns to combine predictions from multiple EDL models

Aggregates uncertainty estimates from multiple AI experts

Trains Evidential Deep Learning experts on diverse ground truths and modeling strategies
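The points above can be sketched as a gating step that weights each expert's output. The source does not specify the fusion rule or the gating architecture, so the version below makes a simplifying assumption: the gate emits one logit per expert, a softmax turns the logits into weights, and both the class probabilities and the uncertainties are fused as weighted averages. The names `softmax` and `fuse_experts` are hypothetical helpers.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_experts(gate_logits, expert_probs, expert_uncertainty):
    """Weighted fusion of expert outputs (an assumed rule, not the paper's).

    gate_logits:        (batch, n_experts) scores from a gating network
    expert_probs:       (batch, n_experts, n_classes) per-expert probabilities
    expert_uncertainty: (batch, n_experts) per-expert uncertainty scores
    """
    weights = softmax(np.asarray(gate_logits, dtype=float), axis=-1)
    probs = np.asarray(expert_probs, dtype=float)
    unc = np.asarray(expert_uncertainty, dtype=float)
    fused_probs = (weights[..., None] * probs).sum(axis=1)  # (batch, n_classes)
    fused_unc = (weights * unc).sum(axis=1)                 # (batch,)
    return fused_probs, fused_unc, weights

# One sample, two experts, two classes (made-up numbers for illustration).
fused_probs, fused_unc, weights = fuse_experts(
    gate_logits=[[0.0, 0.0]],
    expert_probs=[[[0.8, 0.2], [0.4, 0.6]]],
    expert_uncertainty=[[0.1, 0.5]],
)
```

In a trained system the gate logits would come from a learned network conditioned on the input, so experts that are more reliable for a given video receive more weight; here they are fixed inputs to keep the sketch self-contained.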
