Variational Search Distributions

📅 2024-09-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the generation of discrete, combinatorial designs that belong to a rare desired class, such as functional protein or nucleic acid sequences, when each evaluation is an expensive black box (e.g., a wet-lab experiment or a simulation) queried in sequential batches. The authors call this task active generation and propose Variational Search Distributions (VSD), which formalizes the task's requirements and desiderata and solves it via variational inference. VSD conditions a generative model on the desired class, is trained with off-the-shelf gradient-based optimization routines, and can take advantage of scalable predictive models. Theoretically, the authors derive asymptotic convergence rates for learning the true conditional generative distribution of designs under certain configurations of the method. Empirically, after illustrating the generative model on images, they show that VSD can outperform existing baseline methods on a set of real sequence-design problems in protein and DNA/RNA engineering.

📝 Abstract
We develop VSD, a method for conditioning a generative model of discrete, combinatorial designs on a rare desired class by efficiently evaluating a black-box (e.g. experiment, simulation) in a batch sequential manner. We call this task active generation; we formalize active generation's requirements and desiderata, and formulate a solution via variational inference. VSD uses off-the-shelf gradient based optimization routines, can learn powerful generative models for desirable designs, and can take advantage of scalable predictive models. We derive asymptotic convergence rates for learning the true conditional generative distribution of designs with certain configurations of our method. After illustrating the generative model on images, we empirically demonstrate that VSD can outperform existing baseline methods on a set of real sequence-design problems in various protein and DNA/RNA engineering tasks.
Problem

Research questions and friction points this paper is trying to address.

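How to condition a generative model of discrete, combinatorial designs on a rare desired class using batched black-box evaluations.
How to formalize active generation and solve it via variational inference with gradient-based optimization.
Whether VSD can outperform existing baseline methods on protein and DNA/RNA engineering tasks.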
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conditions a generative model of discrete designs on a rare desired class.
Formulates active generation as a variational inference problem.
Uses off-the-shelf gradient-based optimization routines and can take advantage of scalable predictive models.
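
To make the variational-inference idea concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes an objective of the form E_q[log p(desired | x)] - KL(q || prior), a mean-field categorical q over DNA-like sequences, a uniform prior, and a score-function (REINFORCE) gradient with a simple mean baseline. The names `TARGET`, `log_p_desired`, `step`, and `train` are hypothetical, and the toy scoring rule stands in for a surrogate fit to black-box evaluations.

```python
import math
import random

ALPHABET = "ACGT"
K = len(ALPHABET)
TARGET = "GATTAC"  # hypothetical "rare desired" design, for illustration only
SEQ_LEN = len(TARGET)


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def log_p_desired(seq, threshold=0.8, sharpness=10.0):
    """Toy stand-in for a surrogate's log p(desired | x).

    Assumption: a real setup would use a classifier fit to batched
    black-box evaluations; here the score is the match fraction to TARGET.
    """
    score = sum(a == b for a, b in zip(seq, TARGET)) / SEQ_LEN
    z = sharpness * (score - threshold)
    return -math.log1p(math.exp(-z))  # log-sigmoid(z)


def step(logits, rng, n_samples=64, lr=0.5):
    """One score-function gradient ascent step on the assumed objective.

    Updates logits in place and returns the Monte Carlo objective estimate.
    """
    probs = [softmax(row) for row in logits]
    draws = [[rng.choices(range(K), weights=p)[0] for p in probs]
             for _ in range(n_samples)]
    log_prior = -SEQ_LEN * math.log(K)  # uniform prior over sequences
    f = []
    for x in draws:
        seq = "".join(ALPHABET[k] for k in x)
        log_q = sum(math.log(probs[i][k]) for i, k in enumerate(x))
        # Integrand of the objective: log p(desired|x) + log prior(x) - log q(x)
        f.append(log_p_desired(seq) + log_prior - log_q)
    baseline = sum(f) / n_samples  # mean baseline to reduce variance
    for i in range(SEQ_LEN):
        for j in range(K):
            # d log q(x) / d logits[i][j] = 1[x_i == j] - probs[i][j]
            grad = sum((fx - baseline) * ((x[i] == j) - probs[i][j])
                       for x, fx in zip(draws, f)) / n_samples
            logits[i][j] += lr * grad
    return baseline


def train(steps=200, seed=0):
    rng = random.Random(seed)
    logits = [[0.0] * K for _ in range(SEQ_LEN)]  # start from uniform q
    for _ in range(steps):
        step(logits, rng)
    return logits


if __name__ == "__main__":
    probs = [softmax(row) for row in train()]
    # Print the most probable letter at each position under the learned q.
    print("".join(ALPHABET[max(range(K), key=p.__getitem__)] for p in probs))
```

Under these assumptions, q should place increasing mass on sequences close to `TARGET`, while the KL term keeps it from collapsing onto a single sequence. In an active-generation loop, samples from q would be sent to the black box in batches and the surrogate refit before the next round.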