Discovery and Expansion of New Domains within Diffusion Models

📅 2023-10-13
📈 Citations: 1
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work investigates few-shot, fine-tuning-free domain generalization of diffusion models, i.e., high-fidelity cross-domain generation on unseen target domains without fine-tuning the pre-trained model, using only a few target-domain images. We propose a tuning-free paradigm: latent-space inversion via bidirectional deterministic diffusion/denoising trajectories to locate and activate separable out-of-distribution (OOD) Gaussian prior modes inherently embedded in the model's latent space. The method requires no gradient updates, relying solely on gradient-free latent search. Extensive experiments across multiple diffusion models and diverse domains demonstrate substantial improvements in cross-domain generation quality while preserving original-domain fidelity. This is the first study to reveal that single-domain-trained denoising diffusion probabilistic models (DDPMs) implicitly encode cross-domain representational capacity and OOD latent separability. The authors further validate its scientific utility by applying it to sparse astrophysical data generation.
๐Ÿ“ Abstract
In this work, we study the generalization properties of diffusion models in a few-shot setup, introduce a novel tuning-free paradigm to synthesize the target out-of-domain (OOD) data, and demonstrate its advantages compared to existing methods in data-sparse scenarios with large domain gaps. Specifically, given a pre-trained model and a small set of images that are OOD relative to the model's training distribution, we explore whether the frozen model is able to generalize to this new domain. We begin by revealing that Denoising Diffusion Probabilistic Models (DDPMs) trained on single-domain images are already equipped with sufficient representation abilities to reconstruct arbitrary images from the inverted latent encodings following bi-directional deterministic diffusion and denoising trajectories. We then demonstrate, from both theoretical and empirical perspectives, that the OOD images establish Gaussian priors in the latent spaces of the given model, and that the inverted latent modes are separable from those of the initial training domain. We then introduce our novel tuning-free paradigm, which synthesizes new images of the target unseen domain by discovering qualified OOD latent encodings in the inverted noisy spaces. This is fundamentally different from the current paradigm, which modifies the denoising trajectory toward the same goal by tuning the model parameters. Extensive cross-model and cross-domain experiments show that our proposed method can expand the latent space and generate unseen images via frozen DDPMs without impairing generation quality in their original domain. We also showcase a practical application of our proposed heuristic approach in dramatically different domains using astrophysical data, revealing the great potential of such a generalization paradigm in data-sparse fields such as scientific exploration.
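The "bi-directional deterministic diffusion and denoising trajectories" in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: `eps_model` is a hypothetical stand-in for a trained noise predictor, and for the demo it ignores its input so that the inversion step is an exact algebraic inverse (with a real network, DDIM-style inversion is only a first-order approximation). The point it demonstrates is the abstract's premise that a deterministic update rule lets an arbitrary image be mapped to a latent encoding and reconstructed from it.

```python
import numpy as np

# Toy sketch (assumption-laden, not the paper's code): deterministic
# DDIM-style updates with eta = 0 are invertible, so an image can be
# pushed to a noisy latent and pulled back exactly.

T = 50
betas = np.linspace(1e-4, 0.02, T)
abar = np.cumprod(1.0 - betas)  # cumulative \bar{alpha}_t schedule

rng = np.random.default_rng(0)
eps_table = rng.standard_normal((T, 4))  # fixed per-step "predictions"

def eps_model(x, t):
    # Hypothetical stand-in for a trained eps_theta(x_t, t); it ignores
    # x here so that the inversion below is exact rather than approximate.
    return eps_table[t]

def ddim_step(x_t, t):
    """Deterministic denoising step x_t -> x_{t-1}."""
    eps = eps_model(x_t, t)
    x0_pred = (x_t - np.sqrt(1 - abar[t]) * eps) / np.sqrt(abar[t])
    ab_prev = abar[t - 1] if t > 0 else 1.0
    return np.sqrt(ab_prev) * x0_pred + np.sqrt(1 - ab_prev) * eps

def ddim_invert_step(x_prev, t):
    """Algebraic inverse of ddim_step: x_{t-1} -> x_t."""
    eps = eps_model(x_prev, t)
    ab_prev = abar[t - 1] if t > 0 else 1.0
    x0_pred = (x_prev - np.sqrt(1 - ab_prev) * eps) / np.sqrt(ab_prev)
    return np.sqrt(abar[t]) * x0_pred + np.sqrt(1 - abar[t]) * eps

x0 = rng.standard_normal(4)        # stands in for an (OOD) image
x = x0
for t in range(T):                 # forward: deterministic diffusion/inversion
    x = ddim_invert_step(x, t)
latent = x                         # inverted latent encoding
for t in reversed(range(T)):       # backward: deterministic denoising
    latent = ddim_step(latent, t)

print(np.allclose(latent, x0))    # round trip reconstructs the input
```

Under these assumptions the round trip is exact up to floating-point error; the paper's contribution then lies in *searching* this inverted latent space, gradient-free, for encodings that decode into the target unseen domain.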
Problem

Research questions and friction points this paper is trying to address.

Investigates domain generalization in diffusion models for image synthesis.
Proposes sampling-based method to generate unseen domain images without fine-tuning.
Demonstrates latent space expansion for OOD images while preserving quality.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Frozen pre-trained diffusion models for generalization
Sampling-based OOD latent encodings without fine-tuning
Expanding latent space to generate unseen domain images
Ye Zhu
Department of Computer Science, Princeton University, USA
Yu Wu
University of Cambridge
machine learning, health sensing, mobile health
Duo Xu
Canadian Institute for Theoretical Astrophysics (CITA), University of Toronto, Canada
Zhiwei Deng
Google DeepMind, Princeton University
Computer Vision, Machine Learning, Deep Learning
Yan Yan
Department of Computer Science, University of Illinois Chicago, USA
Olga Russakovsky
Associate Professor, Princeton University
Computer vision