Latent Bayesian Optimization via Autoregressive Normalizing Flows

📅 2025-04-21

📈 Citations: 1

✨ Influential: 0

career value

182K/year

🤖 AI Summary

Existing latent Bayesian optimization (LBO) methods for high-dimensional structured data—such as molecular sequences—suffer from severe performance degradation due to reconstruction errors in encoder-decoder mappings, causing misalignment between latent-space and input-space objective values. To address this, we propose FlowBO: (1) it employs invertible normalizing flows to construct exact, bijective encoder-decoder mappings, eliminating reconstruction bias entirely; (2) it introduces autoregressive normalizing flows (SeqFlow) into LBO for the first time, jointly modeling sequential structure and ensuring probabilistic invertibility; and (3) it designs a dynamic exploration sampling strategy based on token-wise importance to enhance query efficiency in the latent space. Evaluated on molecular generation tasks, FlowBO significantly outperforms both classical and state-of-the-art LBO approaches: optimization convergence accelerates by 32%–57%, while simultaneously improving both validity and diversity of generated molecules.

Technology Category

Application Category

📝 Abstract

Bayesian Optimization (BO) has been recognized for its effectiveness in optimizing expensive and complex objective functions. Recent advancements in Latent Bayesian Optimization (LBO) have shown promise by integrating generative models such as variational autoencoders (VAEs) to manage the complexity of high-dimensional and structured data spaces. However, existing LBO approaches often suffer from the value discrepancy problem, which arises from the reconstruction gap between input and latent spaces. This value discrepancy problem propagates errors throughout the optimization process, leading to suboptimal outcomes. To address this issue, we propose a Normalizing Flow-based Bayesian Optimization (NF-BO), which utilizes normalizing flow as a generative model to establish one-to-one encoding function from the input space to the latent space, along with its left-inverse decoding function, eliminating the reconstruction gap. Specifically, we introduce SeqFlow, an autoregressive normalizing flow for sequence data. In addition, we develop a new candidate sampling strategy that dynamically adjusts the exploration probability for each token based on its importance. Through extensive experiments, our NF-BO method demonstrates superior performance in molecule generation tasks, significantly outperforming both traditional and recent LBO approaches.

Problem

Research questions and friction points this paper is trying to address.

Addresses value discrepancy in Latent Bayesian Optimization

Uses normalizing flow for one-to-one encoding to eliminate reconstruction gap

Improves optimization in high-dimensional structured data spaces

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses normalizing flow for one-to-one encoding

Introduces SeqFlow for sequence data

Dynamic token exploration probability adjustment

🔎 Similar Papers

Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting

2024-10-04arXiv.orgCitations: 0

Optimizing Time Series Forecasting Architectures: A Hierarchical Neural Architecture Search Approach

2024-06-07arXiv.orgCitations: 0

Bosch Group

Renningen, BW, DE

AI Research Scientist, CoreML - Monetization AI