🤖 AI Summary
To address a key limitation of conventional generative models (e.g., GANs, diffusion models), namely their reliance on large-scale labeled data, in data-scarce domains such as medical imaging and remote sensing, this paper proposes a unified framework termed "Generative Modeling under Data Constraint" (GM-DC). It systematically establishes a two-dimensional taxonomy: (i) a task dimension, encompassing limited-data, few-shot, and zero-shot settings; and (ii) a methodological dimension, integrating transfer learning, meta-learning, prompt engineering, and multi-paradigm fusion. This work is the first to uncover cross-paradigm adaptation principles and synergistic mechanisms under data constraints. The survey comprehensively analyzes lightweight designs and knowledge-transfer strategies for mainstream architectures, including VAEs, GANs, and diffusion models, and identifies critical research gaps while charting emerging trends. As the first holistic GM-DC survey, it is accompanied by an open-source platform for continuously updated resources, providing both theoretical foundations and practical guidance for data-efficient generative modeling.
📝 Abstract
In machine learning, generative modeling aims to learn to generate new data statistically similar to the training-data distribution. In this paper, we survey learning generative models under limited data, few shots, and zero shot, referred to as Generative Modeling under Data Constraint (GM-DC). This is an important topic when data acquisition is challenging, e.g., in healthcare applications. We discuss the background and challenges, and propose two taxonomies: one on GM-DC tasks and another on GM-DC approaches. Importantly, we study the interactions between different GM-DC tasks and approaches. Furthermore, we highlight research gaps, research trends, and potential avenues for future exploration. Project website: https://gmdc-survey.github.io.