GenMed: A Pairwise Generative Reformulation of Medical Diagnostic Tasks

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited generalizability of traditional medical AI systems, which rely on discriminative models and struggle with the heterogeneous, sparse, or multimodal data common in real-world clinical settings. The authors propose a generative paradigm for medical diagnostic tasks that uses diffusion models to learn the joint distribution $P(X,Y)$ of inputs and outputs. Inference is reformulated as a gradient-guided, test-time optimization problem, enabling flexible reasoning over arbitrary combinations of observed variables without architectural modifications or retraining. The approach shows strong performance across diverse tasks: standard and cross-modality segmentation, few-shot segmentation (2 or 4 training samples), segmentation from degraded inputs, shape completion from sparse and partial observations, and zero-shot application. To support these evaluations, the authors also curated and released a large-scale text-shape dataset derived from MedShapeNet.
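
One common way to formalize this kind of guidance, stated here as an assumption rather than as the paper's exact objective, is Bayes' rule applied to the noised joint variable. Writing $z_t$ for the noised pair $(X, Y)$ at diffusion time $t$ and $o$ for whatever subset of variables is observed (both symbols are introduced here for illustration), the conditional score splits into a learned prior term and a guidance term:

$$
\nabla_{z_t} \log p_t(z_t \mid o) = \nabla_{z_t} \log p_t(z_t) + \nabla_{z_t} \log p_t(o \mid z_t)
$$

The first term comes from the diffusion model of $P(X,Y)$; the second depends only on which variables are observed, which is why changing the observation pattern would not require retraining under this reading.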

📝 Abstract

Data-driven medical AI is traditionally formulated as a discriminative mapping from input $X$ to output $Y$ via a learned function $f$, which does not generalize well across the heterogeneous data and modalities encountered in real-world clinical settings. In this work, we propose a fundamentally different, generative paradigm. We model the joint distribution $P(X,Y)$ using diffusion models and reframe inference as a test-time output optimization problem. By guiding the generative process to match observed inputs, our framework enables flexible, gradient-based conditioning at inference time without architectural changes or retraining, effectively supporting arbitrary and previously unseen combinations of observations. Extensive experiments demonstrate strong performance across standard and cross-modality medical image segmentation, few-shot segmentation with only 2 or 4 training samples, degraded-input segmentation, shape completion from sparse and partial observations, and, to demonstrate generality, zero-shot application. To support these evaluations, we curated and released a large-scale text-shape dataset derived from MedShapeNet. Our results highlight the versatility of generative joint modeling as a foundation for reusable, task-agnostic medical AI systems.
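
The idea of guiding a generative process toward observed inputs can be illustrated with a toy sketch. The example below is not the paper's implementation: it replaces the learned diffusion model with the analytic score of a two-dimensional Gaussian joint over a scalar $X$ and a scalar $Y$, and it uses annealed Langevin sampling with an added guidance gradient. The names `joint_score`, `guidance_grad`, and `guided_sample`, the correlation `RHO`, the observation noise `tau`, and the noise schedule are all hypothetical choices made for illustration. The guidance term also applies the observation likelihood directly to the noisy sample, which is a simplification.

```python
import numpy as np

RHO = 0.9  # assumed correlation between X and Y in the toy joint
SIGMA_JOINT = np.array([[1.0, RHO], [RHO, 1.0]])  # covariance of toy P(X, Y)


def joint_score(z, sigma):
    """Score of the noised joint N(0, SIGMA_JOINT + sigma^2 I).

    Stands in for a trained diffusion score network. z has shape (n, 2),
    with column 0 holding x and column 1 holding y.
    """
    precision = np.linalg.inv(SIGMA_JOINT + sigma**2 * np.eye(2))
    return -z @ precision  # precision is symmetric, so this is -(P z) per row


def guidance_grad(z, x_obs, tau):
    """Gradient of log N(x_obs; x, tau^2) with respect to z = (x, y).

    Only the x coordinate is observed, so the y component is zero.
    """
    grad = np.zeros_like(z)
    grad[:, 0] = (x_obs - z[:, 0]) / tau**2
    return grad


def guided_sample(x_obs, tau=0.3, n_chains=2000, steps_per_level=100,
                  step=0.02, seed=0):
    """Annealed Langevin sampling of (x, y) guided toward an observed x."""
    rng = np.random.default_rng(seed)
    sigmas = np.geomspace(1.0, 0.05, num=10)  # noise levels, high to low
    z = rng.normal(size=(n_chains, 2)) * np.sqrt(1.0 + sigmas[0] ** 2)
    for sigma in sigmas:
        for _ in range(steps_per_level):
            # Prior score from the joint model plus the observation gradient.
            grad = joint_score(z, sigma) + guidance_grad(z, x_obs, tau)
            noise = rng.normal(size=z.shape)
            z = z + step * grad + np.sqrt(2.0 * step) * noise
    return z


if __name__ == "__main__":
    samples = guided_sample(x_obs=1.5)
    print("estimated y given x_obs:", samples[:, 1].mean())
```

In this toy setting the average of the sampled `y` values should land close to the closed-form Gaussian posterior mean `RHO * x_obs / (1 + tau**2)`. Observing `y` and inferring `x` instead would only require changing which column `guidance_grad` fills in; `joint_score` stays untouched, which mirrors the claim that new observation patterns need no retraining.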

Problem

Research questions and friction points this paper is trying to address.

medical AI

generalization

heterogeneous data

multimodal learning

discriminative modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

generative modeling

diffusion models

test-time optimization