Causal Disentanglement for Robust Long-tail Medical Image Generation

📅 2025-04-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address poor generation quality, limited diversity, and weak clinical interpretability of minority classes in medical imaging under long-tailed class distributions, this paper proposes a causal-disentanglement-driven dual-stream diffusion generative framework that separately models pathological and anatomical information. We employ causal representation learning to disentangle pathology- and anatomy-specific features, enforce feature independence via group-wise supervision, and integrate large language models to parse lesion location and severity from clinical reports for precise, text-guided pathological modeling. Additionally, we introduce an initial latent noise optimization strategy to enhance structural stability. Evaluated on multiple long-tailed medical imaging datasets, our method achieves a 23.6% reduction in Fréchet Inception Distance (FID), an 18.4% improvement in classification accuracy, and a 2.1× increase in minority-class generation diversity, significantly improving pathological plausibility, anatomical fidelity, and clinical credibility of generated images.
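The summary does not specify how the group-wise supervision enforces independence between the pathology and anatomy feature groups. One simple, hypothetical reading is a cross-covariance penalty that decorrelates the two embedding groups across a batch; the function below is an illustrative sketch (the name `group_independence_loss` and the second-order decorrelation objective are assumptions, not the paper's actual loss):

```python
import numpy as np

def group_independence_loss(z_path, z_anat):
    """Cross-covariance penalty between pathology and anatomy
    feature groups (batch x dim arrays); driving it to zero pushes
    the two groups toward second-order statistical independence."""
    z_p = z_path - z_path.mean(axis=0, keepdims=True)  # center each group
    z_a = z_anat - z_anat.mean(axis=0, keepdims=True)
    cov = z_p.T @ z_a / (len(z_p) - 1)  # cross-covariance matrix
    return float((cov ** 2).sum())      # squared Frobenius norm
```

In a training loop this term would be added to the generative objective, trading off reconstruction quality against disentanglement via a weighting coefficient.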

📝 Abstract
Counterfactual medical image generation effectively addresses data scarcity and enhances the interpretability of medical images. However, owing to the complex and diverse pathological features of medical images and the imbalanced class distribution of medical data, generating high-quality and diverse medical images from limited data is significantly challenging. It is also desirable to fully exploit the information available in limited data, such as anatomical structure, so as to generate more structurally stable medical images while avoiding distortion or inconsistency. In this paper, to enhance the clinical relevance of generated data and improve the interpretability of the model, we propose a novel medical image generation framework that produces independent pathological and structural features based on causal disentanglement and uses text-guided modeling of pathological features to regulate the generation of counterfactual images. First, we achieve feature separation through causal disentanglement and analyze the interactions between features, introducing group supervision to ensure the independence of pathological and identity features. Second, we leverage a diffusion model guided by pathological findings to model pathological features, enabling the generation of diverse counterfactual images, and we improve accuracy by using a large language model to extract lesion severity and location from medical reports. Finally, we improve the performance of the latent diffusion model on long-tailed categories through initial noise optimization.
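The initial noise optimization is described only at a high level. One plausible sketch, assuming access to the gradient of some differentiable plausibility score over the latent (the names `optimize_initial_noise` and `score_grad` are hypothetical), is to take a few ascent steps on the initial latent while re-projecting it onto the expected norm of the Gaussian prior so it remains a valid starting point for the reverse diffusion:

```python
import numpy as np

def optimize_initial_noise(z0, score_grad, steps=50, lr=0.1):
    """Refine the initial diffusion latent z0 by gradient ascent on a
    plausibility score, keeping z on the Gaussian prior's typical shell.
    score_grad(z) must return the gradient of the score at z."""
    z = z0.copy()
    for _ in range(steps):
        z = z + lr * score_grad(z)  # ascent step on the score
        # Re-project toward norm sqrt(d), the typical radius of a
        # standard Gaussian in d dimensions, so z stays a plausible
        # sample of the diffusion prior.
        z = z * np.sqrt(z.size) / (np.linalg.norm(z) + 1e-8)
    return z
```

The refined latent would then replace the usual random draw when sampling minority-class images, which is one way such a strategy could stabilize structure without retraining the model.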
Problem

Research questions and friction points this paper is trying to address.

Generate diverse medical images from limited data
Disentangle pathological and structural features causally
Improve interpretability and clinical relevance of generated images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal disentanglement for feature separation
Text-guided pathological feature modeling
Noise optimization for long-tail categories
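The text-guided modeling above conditions generation on lesion attributes (label, severity, location) parsed from reports by a large language model. A minimal sketch of turning such extracted fields into a conditioning prompt is below; the field schema, function name, and chest X-ray wording are all assumptions for illustration, not the paper's interface:

```python
def build_pathology_prompt(findings):
    """Compose a text condition for a diffusion model from
    LLM-extracted report fields (hypothetical schema: each finding
    is a dict with 'label' and optional 'severity'/'location')."""
    parts = []
    for f in findings:
        severity = f.get("severity", "unspecified")
        location = f.get("location", "unspecified region")
        parts.append(f"{severity} {f['label']} in the {location}")
    if not parts:
        return "Normal chest X-ray"
    return "Chest X-ray showing " + "; ".join(parts)
```

Structured fields like these make the conditioning signal auditable: a clinician can check the parsed severity and location against the source report before the image is generated.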