Med-SegLens: Latent-Level Model Diffing for Interpretable Medical Image Segmentation

📅 2026-02-11
🤖 AI Summary
This work addresses the limited interpretability of medical image segmentation models, which hinders error diagnosis and weakens robustness under dataset shift. The authors propose Med-SegLens, a latent-level model-diffing framework for medical image segmentation that uses sparse autoencoders to extract interpretable latent features from the internal activations of SegFormer and U-Net. By aligning latents across architectures and across healthy, adult, pediatric, and sub-Saharan African glioma cohorts, they find a stable backbone of shared representations, with dataset shift driven by differential reliance on population-specific latents. Targeted latent-level interventions, applied without retraining, recover performance in 70% of failure cases, raise the Dice score from 39.4% to 74.2%, and improve cross-dataset adaptation.
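
For readers unfamiliar with the metric quoted above, the Dice score measures the overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch for binary masks, not the paper's evaluation code; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def dice_score(pred, target):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / denom)

# Toy example: two predicted foreground pixels, one of which is correct.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_score(pred, target)
```

In this toy case the intersection has one pixel and the two masks contain three foreground pixels in total, so the score is 2/3. Multi-class segmentation tasks typically compute this per class and then average, which this sketch does not cover.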

📝 Abstract
Modern segmentation models achieve strong predictive performance but remain largely opaque, limiting our ability to diagnose failures, understand dataset shift, or intervene in a principled manner. We introduce Med-SegLens, a model-diffing framework that decomposes segmentation model activations into interpretable latent features using sparse autoencoders trained on SegFormer and U-Net. Through cross-architecture and cross-dataset latent alignment across healthy, adult, pediatric, and sub-Saharan African glioma cohorts, we identify a stable backbone of shared representations, while dataset shift is driven by differential reliance on population-specific latents. We show that these latents act as causal bottlenecks for segmentation failures, and that targeted latent-level interventions can correct errors and improve cross-dataset adaptation without retraining, recovering performance in 70% of failure cases and improving Dice score from 39.4% to 74.2%. Our results demonstrate that latent-level model diffing provides a practical and mechanistic tool for diagnosing failures and mitigating dataset shift in segmentation models.
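
The general idea of a latent-level intervention through a sparse autoencoder can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the weights are random rather than trained, the dimensions `d_model` and `d_latent` are arbitrary, and the names `encode`, `decode`, and `intervene` are hypothetical. The abstract does not specify the SAE architecture, so a plain ReLU encoder with a linear decoder is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_latent = 16, 64  # assumed sizes, chosen only for illustration

# Untrained stand-in weights; a real SAE would learn these from activations.
W_enc = rng.normal(scale=0.1, size=(d_model, d_latent))
b_enc = np.zeros(d_latent)
W_dec = rng.normal(scale=0.1, size=(d_latent, d_model))
b_dec = np.zeros(d_model)

def encode(x):
    """Map activations to non-negative latent codes (ReLU encoder)."""
    return np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)

def decode(z):
    """Map latent codes back to activation space (linear decoder)."""
    return z @ W_dec + b_dec

def intervene(x, latent_idx, scale):
    """Rescale one latent and return the correspondingly edited activations.

    Only the change in the reconstruction is added to x, so the part of x
    that the SAE fails to reconstruct is left untouched.
    """
    z = encode(x)
    z_edit = z.copy()
    z_edit[..., latent_idx] *= scale
    return x + (decode(z_edit) - decode(z))

# Toy activations: 4 spatial positions, each a d_model-dimensional vector.
x = rng.normal(size=(4, d_model))
x_ablated = intervene(x, latent_idx=3, scale=0.0)  # switch latent 3 off
```

In a real pipeline the edited activations would be written back into the segmentation model's forward pass, for example via a hook at the layer the SAE was trained on, and the downstream mask would be re-evaluated. Setting `scale` to 0 ablates a latent, while values above 1 amplify it.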

Problem

Research questions and friction points this paper is trying to address.

medical image segmentation
model interpretability
dataset shift
failure diagnosis
latent representations

Innovation

Methods, ideas, or system contributions that make the work stand out.

latent-level model diffing
interpretable medical image segmentation
sparse autoencoders
dataset shift mitigation
causal bottleneck