Define latent spaces by example: optimisation over the outputs of generative models

📅 2025-09-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of efficiently reconciling probability under a generative model with task-specific constraints on its outputs. The authors propose a training-free, non-parametric, example-driven method that constructs a low-dimensional, Euclidean, interpretable latent space directly over the outputs of any pre-trained generative model, covering modalities including images, audio, video, and protein sequences. Crucially, the approach requires no additional training and incurs almost no extra computational cost: it defines a coordinate system solely from a small set of user-provided examples and enables controllable generation via standard optimisation algorithms. The method is model-agnostic, compatible with diverse architectures such as diffusion and flow-matching models. Experiments demonstrate substantial improvements in constraint satisfaction, flexibility, and practical usability across applications including experimental design and creative generation.

📝 Abstract
Modern generative AI models such as diffusion and flow matching can sample from rich data distributions, but many downstream tasks -- such as experimental design or creative content generation -- require a higher level of control than unconstrained sampling. The challenge is to efficiently identify outputs that are both probable under the model and satisfy task-specific constraints. We address this by introducing surrogate latent spaces: non-parametric, low-dimensional Euclidean embeddings that can be extracted from any generative model without additional training. The axes in the Euclidean space can be defined via examples, providing a simple and interpretable approach to define custom latent spaces that both express intended features and are convenient to use in downstream tasks. The representation is Euclidean and has controllable dimensionality, permitting direct application of standard optimisation algorithms to traverse the outputs of generative models. Our approach is architecture-agnostic, incurs almost no additional computational cost, and generalises across modalities, including images, audio, videos, and structured objects like proteins.
Problem

Research questions and friction points this paper is trying to address.

Defining custom latent spaces through examples for generative models
Optimizing outputs to satisfy task-specific constraints while remaining probable under the model
Creating architecture-agnostic embeddings for multimodal content control
Innovation

Methods, ideas, or system contributions that make the work stand out.

Defines latent spaces via examples without retraining models
Uses Euclidean embeddings for interpretable feature control
Enables standard optimization across multimodal generative outputs
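The core idea can be sketched in a few lines: anchor a low-dimensional Euclidean coordinate system on a handful of example inputs to a pre-trained generator, then run any off-the-shelf optimiser over those coordinates. Everything below is a hypothetical illustration under assumed details, not the paper's actual algorithm: the toy `generator`, the linear anchor-combination decoder, the norm rescaling (one plausible way to keep combined noise probable under a Gaussian prior), and the random-search optimiser are all stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in generator: in practice this would be a pre-trained
# diffusion or flow-matching sampler mapping noise z -> output x.
W_true = rng.normal(size=(8, 64))

def generator(z):
    return np.tanh(z @ W_true)

# Example-defined surrogate latent space: k anchor noise vectors,
# one per user-provided example (here drawn at random for the demo).
k, d = 3, 8
anchors = rng.normal(size=(k, d))

def decode(w):
    """Map surrogate coordinates w in R^k to a model output.

    The combined noise is rescaled to the typical Gaussian norm so it
    stays plausible under the prior (an assumed design choice).
    """
    z = w @ anchors
    z = z / (np.linalg.norm(z) + 1e-9) * np.sqrt(d)
    return generator(z)

# Task-specific objective: match a target output as closely as possible.
target = generator(rng.normal(size=d))

def objective(w):
    return float(np.sum((decode(w) - target) ** 2))

# Standard optimisation directly in the low-dimensional Euclidean space;
# simple accept-if-better random search here, but any optimiser would do.
best_w = np.ones(k) / k
best_f = objective(best_w)
for _ in range(500):
    cand = best_w + 0.3 * rng.normal(size=k)
    f = objective(cand)
    if f < best_f:
        best_w, best_f = cand, f
```

Because `w` lives in a small Euclidean space, the generator itself is never retrained or modified; only its inputs are steered, which is what makes the approach architecture-agnostic and nearly free in compute.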
Authors

Samuel Willis (University of Cambridge)
Alexandru I. Stere (Boeing Commercial Airplanes)
Dragos D. Margineantu (Boeing AI)
Henry T. Oldroyd (Lancaster University)
John A. Fozard (Lancaster University)
Carl Henrik Ek (University of Cambridge)
Henry Moss (University of Cambridge, Lancaster University)
Erik Bodin (University of Cambridge)