Shortest-Path Flow Matching with Mixture-Conditioned Bases for OOD Generalization to Unseen Conditions

📅 2026-01-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited out-of-distribution generalization of conditional generative models, which struggle to extrapolate to unseen conditioning inputs. The authors propose SP-FM, a framework that uses a learnable, descriptor-dependent mixture distribution as the base measure in conditional flow matching. Combined with a shortest-path flow field, this conditioned base supports smooth interpolation and more reliable extrapolation to previously unobserved conditions. In evaluations on single-cell transcriptomic response prediction and high-content microscopy drug screening, SP-FM is reported to outperform standard conditional flow matching baselines.

📝 Abstract
Robust generalization under distribution shift remains a key challenge for conditional generative modeling: conditional flow-based methods often fit the training conditions well but fail to extrapolate to unseen ones. We introduce SP-FM, a shortest-path flow-matching framework that improves out-of-distribution (OOD) generalization by conditioning both the base distribution and the flow field on the condition. Specifically, SP-FM learns a condition-dependent base distribution parameterized as a flexible, learnable mixture, together with a condition-dependent vector field trained via shortest-path flow matching. Conditioning the base allows the model to adapt its starting distribution across conditions, enabling smooth interpolation and more reliable extrapolation beyond the observed training range. We provide theoretical insights into the resulting conditional transport and show how mixture-conditioned bases enhance robustness under shift. Empirically, SP-FM is effective across heterogeneous domains, including predicting responses to unseen perturbations in single-cell transcriptomics and modeling treatment effects in high-content microscopy-based drug screening. Overall, SP-FM provides a simple yet effective plug-in strategy for improving conditional generative modeling and OOD generalization across diverse domains.
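
The two ingredients named in the abstract, a condition-dependent mixture base and a flow-matching regression target, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that "shortest path" means straight-line interpolation in Euclidean space, that the base is a Gaussian mixture whose weights and means are linear functions of the condition, and that all function and parameter names (`sample_mixture_base`, `straight_path_pair`, `flow_matching_loss`, `w_logits`, `w_means`, `log_std`) are hypothetical.

```python
import numpy as np


def sample_mixture_base(cond, w_logits, w_means, log_std, rng, n):
    """Draw n samples from a Gaussian mixture conditioned on `cond`.

    Assumption (not from the paper): mixture logits and component means
    are linear maps of the condition vector, with one shared scalar std.
    cond: (C,), w_logits: (C, K), w_means: (C, K, D).
    """
    logits = cond @ w_logits                        # (K,)
    probs = np.exp(logits - logits.max())           # numerically stable softmax
    probs /= probs.sum()
    means = np.einsum("c,ckd->kd", cond, w_means)   # (K, D)
    comps = rng.choice(len(probs), size=n, p=probs) # component index per sample
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comps] + np.exp(log_std) * noise   # (n, D)


def straight_path_pair(x0, x1, t):
    """Point on the straight line from x0 to x1 at time t, and its velocity.

    Assumption: the "shortest path" is the Euclidean straight line, so the
    target velocity is the constant x1 - x0.
    """
    x_t = (1.0 - t) * x0 + t * x1
    return x_t, x1 - x0


def flow_matching_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocities."""
    return float(np.mean(np.sum((v_pred - v_target) ** 2, axis=1)))


# Usage: build one training batch for a single (hypothetical) condition.
rng = np.random.default_rng(0)
cond = np.array([1.0, 0.5])                    # condition descriptor, C = 2
w_logits = rng.standard_normal((2, 3))         # K = 3 mixture components
w_means = rng.standard_normal((2, 3, 4))       # D = 4 data dimensions
x0 = sample_mixture_base(cond, w_logits, w_means, 0.0, rng, n=16)
x1 = rng.standard_normal((16, 4))              # stand-in for observed data
t = rng.uniform(size=(16, 1))
x_t, v_target = straight_path_pair(x0, x1, t)
# A learned vector field v(x_t, t, cond) would be regressed onto v_target.
```

In a trainable version, the mixture parameters and the vector field would be optimized jointly with an automatic-differentiation library. The sketch only shows how a batch and its regression target could be assembled under the assumptions above.
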
Problem

Research questions and friction points this paper is trying to address.

out-of-distribution generalization
conditional generative modeling
distribution shift
descriptor-controlled generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-conditioned flow matching
Out-of-distribution generalization
Descriptor-controlled generation
Learnable base distribution
Shortest-path flow matching
Andrea Rubbi
Wellcome Sanger Institute, Cambridge, United Kingdom

Amir Akbarnejad
Wellcome Sanger Institute, Cambridge, United Kingdom

Mohammad V. Sanian
Wellcome Sanger Institute, Cambridge, United Kingdom

Aryan Yazdan Parast
School of Computing and Information Systems, The University of Melbourne, Melbourne, Australia

Hesam Asadollahzadeh
Wellcome Sanger Institute, Cambridge, United Kingdom

Arian Amani
Wellcome Sanger Institute, Cambridge, United Kingdom

Naveed Akhtar
The University of Melbourne
Computer Vision, Pattern Recognition, Robotics, Remote Sensing

Sarah Cooper
Wellcome Sanger Institute, Cambridge, United Kingdom

Andrew Bassett
Wellcome Trust Sanger Institute, Cambridge
Genome Editing, CRISPR, iPSC, Chromatin, non-coding RNA

Pietro Lio
Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom

L. Paavolainen
Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland

Sattar Vakili
MediaTek Research
Machine Learning

Mohammad Lotfollahi
Wellcome Sanger Institute, Cambridge, United Kingdom