🤖 AI Summary
Medical image segmentation suffers from heterogeneous biases, including label noise and inconsistent annotation styles, that severely degrade model robustness. To address this, we propose a hypernetwork-based self-organizing robust training framework that, for the first time, jointly learns a latent space and a hypernetwork to dynamically generate U-Net parameters tailored to sample-level image–label variability. The framework models a multimodal distribution of U-Net parameters over the latent space: low-density regions capture noise patterns, while high-density regions support robust organ segmentation. It also enables interpretable identification of systematic biases and erroneous samples. Evaluated on AMOS (with synthetic perturbations) and TotalSegmentator (which contains real, unknown biases), our method produces a bias-aware structured mapping of the data in which latent-space clusters correspond closely to semantically consistent segmentation behaviors. Code and our analysis of the TotalSegmentator dataset are publicly available.
📝 Abstract
Medical imaging datasets often contain heterogeneous biases ranging from erroneous labels to inconsistent labeling styles. Such biases can degrade the performance of deep segmentation networks, yet identifying and characterizing them is a particularly tedious and challenging task. In this paper, we introduce HyperSORT, a framework in which a hyper-network predicts UNet parameters from latent vectors representing both image and annotation variability. The hyper-network parameters and the collection of latent vectors, one per training sample, are learned jointly. Hence, instead of optimizing a single neural network to fit a dataset, HyperSORT learns a complex distribution of UNet parameters in which low-density areas can capture noise-specific patterns while larger modes robustly segment organs in differentiated but meaningful ways. We validate our method on two public 3D abdominal CT datasets: a synthetically perturbed version of the AMOS dataset, and TotalSegmentator, a large-scale dataset containing real unknown biases and errors. Our experiments show that HyperSORT creates a structured mapping of the dataset that allows the identification of relevant systematic biases and erroneous samples. Latent space clusters yield UNet parameters that perform the segmentation task in accordance with the underlying learned systematic bias. The code and our analysis of the TotalSegmentator dataset are made available: https://github.com/ImFusionGmbH/HyperSORT
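To make the core idea concrete, the following is a minimal sketch of a hyper-network forward pass, assuming NumPy. It is not the authors' implementation: the UNet is replaced by a single per-voxel linear layer, the hyper-network is one linear map, and every name and size (`latents`, `hyper_W`, `generate_params`, `LATENT_DIM`, and so on) is hypothetical. Training, in which the latent table and the hyper-network weights would be optimized jointly against each sample's annotation, is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_SAMPLES = 8          # training cases, each owning one latent vector
LATENT_DIM = 2         # dimension of the latent space
IN_CH, OUT_CH = 1, 3   # toy segmenter: one per-voxel linear layer (a 1x1x1 conv)

# Quantities that would be learned jointly: the latent table and the hyper-network.
latents = rng.normal(size=(N_SAMPLES, LATENT_DIM))
n_target = OUT_CH * IN_CH + OUT_CH            # weights + biases of the toy segmenter
hyper_W = rng.normal(scale=0.1, size=(LATENT_DIM, n_target))
hyper_b = np.zeros(n_target)


def generate_params(z):
    """Map a latent vector to the parameters of the toy segmenter."""
    flat = z @ hyper_W + hyper_b
    weight = flat[: OUT_CH * IN_CH].reshape(OUT_CH, IN_CH)
    bias = flat[OUT_CH * IN_CH:]
    return weight, bias


def segment(volume, z):
    """Segment a (IN_CH, D, H, W) volume with parameters generated from z."""
    weight, bias = generate_params(z)
    logits = np.einsum("oc,cdhw->odhw", weight, volume)
    logits = logits + bias[:, None, None, None]
    return logits.argmax(axis=0)              # (D, H, W) label map


# The same image segmented under two different latent vectors:
volume = rng.normal(size=(IN_CH, 4, 4, 4))
labels_a = segment(volume, latents[0])
labels_b = segment(volume, latents[1])
```

In this sketch, each latent vector selects a different set of segmenter parameters, so the same image can be segmented under different annotation conventions. This is the mechanism the abstract describes: nearby latent vectors form a cluster that shares one labeling behavior, while isolated latent vectors can absorb sample-specific noise.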