🤖 AI Summary
This paper addresses privacy-preserving generative modeling over distributed data. The authors propose Federated Flow Matching (FFM), a decentralized framework that trains flow matching models without centralized data aggregation. Methodologically, FFM combines local optimal transport couplings with a semi-dual formulation of global optimal transport: clients collaboratively align their flow trajectories by sharing scalar potential functions, which improves flow straightness and generation quality. FFM also embeds flow matching training in the federated learning paradigm, preserving privacy end to end. Extensive experiments on synthetic and image benchmarks show that FFM achieves generation performance competitive with centralized training, improves flow straightness by up to 23.6%, and attains better inference efficiency than existing federated generative methods, balancing privacy guarantees, generation fidelity, and computational cost.
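For reference, the two ingredients named above have standard forms in the literature; the expressions below are generic textbook statements of the conditional flow matching objective and the semi-dual (Kantorovich) formulation of optimal transport, not the paper's exact notation:

```latex
% Conditional flow matching objective under a coupling \pi of source p_0 and target p_1
% (generic form; the paper's notation may differ):
\mathcal{L}_{\mathrm{FM}}(\theta)
  = \mathbb{E}_{t \sim \mathcal{U}[0,1],\; (x_0, x_1) \sim \pi}
    \left\| v_\theta(x_t, t) - (x_1 - x_0) \right\|^2,
  \qquad x_t = (1 - t)\, x_0 + t\, x_1 .

% Semi-dual (Kantorovich) formulation of optimal transport with cost c,
% expressed through a single scalar potential \varphi and its c-transform:
\mathrm{OT}_c(p_0, p_1)
  = \max_{\varphi}\;
    \mathbb{E}_{x_1 \sim p_1}\!\left[ \varphi(x_1) \right]
    + \mathbb{E}_{x_0 \sim p_0}\!\left[ \varphi^{c}(x_0) \right],
  \qquad
  \varphi^{c}(x_0) = \min_{x_1} \bigl\{ c(x_0, x_1) - \varphi(x_1) \bigr\}.
```

The shared scalar potential mentioned in the summary plays the role of φ here: coordinating couplings through it avoids exchanging raw samples between clients.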
📝 Abstract
Data today is decentralized: it is generated and stored across devices and institutions where privacy, ownership, and regulation prevent centralization. This motivates training generative models directly on distributed data, without central aggregation. In this paper, we introduce Federated Flow Matching (FFM), a framework for training flow matching models under privacy constraints. We first examine FFM-vanilla, where each client trains locally with independent source-target couplings, preserving privacy but yielding curved flows that slow inference. We then develop FFM-LOT, which employs local optimal transport couplings to improve straightness within each client but lacks global consistency under heterogeneous data. Finally, we propose FFM-GOT, a federated strategy based on the semi-dual formulation of optimal transport, in which a shared global potential function coordinates couplings across clients. Experiments on synthetic and image datasets show that FFM enables privacy-preserving training while improving both flow straightness and sample quality in federated settings, with performance comparable to the centralized baseline.
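To make the coupling distinction concrete, the sketch below shows a mini-batch version of the local step: samples in a client's batch are paired by an exact optimal transport assignment under squared-Euclidean cost, and the conditional flow matching loss is evaluated on the matched pairs (an independent coupling, as in FFM-vanilla, would simply skip the pairing step). Function names, the zero velocity model, and the Gaussian toy data are hypothetical; this is a minimal illustration of mini-batch OT coupling, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def minibatch_ot_pairs(x0, x1):
    """Pair source and target samples via a mini-batch optimal transport plan
    (squared-Euclidean cost, solved exactly as a linear assignment problem).
    Illustrative stand-in for a local OT coupling; not the paper's procedure."""
    cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return x0[rows], x1[cols]

def flow_matching_loss(v_theta, x0, x1, rng):
    """Conditional flow matching loss on coupled pairs: regress the model
    velocity v_theta(x_t, t) onto the straight-line target x1 - x0,
    where x_t = (1 - t) * x0 + t * x1."""
    t = rng.uniform(size=(x0.shape[0], 1))
    xt = (1.0 - t) * x0 + t * x1
    return float(np.mean((v_theta(xt, t) - (x1 - x0)) ** 2))

# Toy usage with a dummy (zero) velocity model and hypothetical Gaussian data.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(64, 2))            # source mini-batch on one client
x1 = rng.normal(loc=3.0, size=(64, 2))   # target mini-batch on the same client
x0_m, x1_m = minibatch_ot_pairs(x0, x1)  # OT-coupled pairs (FFM-LOT-style)
print(flow_matching_loss(lambda x, t: np.zeros_like(x), x0_m, x1_m, rng))
```

Pairing nearby source and target points is what straightens the conditional paths; without it, randomly matched pairs produce the curved flows the abstract attributes to FFM-vanilla.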