Generalization and Memorization in Rectified Flow

📅 2026-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work presents the first systematic investigation into the memorization behavior and associated privacy risks of Rectified Flow models. We propose a non-trivial membership inference attack (MIA) that leverages a complexity-calibrated test statistic, revealing that training data leakage is most pronounced at the midpoint of the flow integration trajectory. To mitigate this vulnerability, we introduce a symmetric exponential time-step sampling strategy that effectively suppresses memorization while preserving sample quality. Experimental results show that the proposed MIA achieves up to a 15% increase in AUC and a 45% improvement in TPR@1%FPR over baseline test statistics, while the proposed sampling strategy suppresses these leakage signals, thereby both exposing and alleviating the inherent privacy risks in Rectified Flow models.
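The complexity-calibrated statistic can be illustrated with a minimal sketch. The idea, as described above, is to divide a per-sample Rectified Flow loss by a measure of the image's intrinsic spatial complexity, so that visually simple images do not masquerade as memorized ones. The complexity proxy below (mean gradient magnitude) and the function names are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def flow_matching_loss(model, x, t, noise):
    """Per-sample Rectified Flow loss at time t: the model predicts the
    straight-line velocity (x - noise) from the interpolant x_t."""
    x_t = (1 - t) * noise + t * x   # linear interpolation between noise and data
    target = x - noise              # RF velocity target
    pred = model(x_t, t)
    return float(np.mean((pred - target) ** 2))

def complexity(x):
    """Hypothetical spatial-complexity proxy: mean gradient magnitude.
    The paper's actual calibration measure is not specified here."""
    gx = np.diff(x, axis=-1)
    gy = np.diff(x, axis=-2)
    return float(np.mean(np.abs(gx)) + np.mean(np.abs(gy))) + 1e-8

def t_mc_cal(model, x, t=0.5, seed=0):
    """Complexity-calibrated membership score: lower values suggest the
    sample was memorized. Evaluated at the trajectory midpoint t=0.5,
    where the summary reports leakage peaks."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(x.shape)
    return flow_matching_loss(model, x, t, noise) / complexity(x)
```

In an attack, one would threshold this score: training members tend to have anomalously low calibrated loss at the midpoint timestep.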

📝 Abstract
Generative models based on the Flow Matching objective, particularly Rectified Flow (RF), have emerged as a dominant paradigm for efficient, high-fidelity image synthesis. However, while existing research heavily prioritizes generation quality and architectural scaling, the underlying dynamics of how RF models memorize training data remain largely underexplored. In this paper, we systematically investigate the memorization behaviors of RF through the test statistics of Membership Inference Attacks (MIA). We progressively formulate three test statistics, culminating in a complexity-calibrated metric ($T_\text{mc\_cal}$) that successfully decouples intrinsic image spatial complexity from genuine memorization signals. This calibration yields a significant performance surge -- boosting attack AUC by up to 15\% and the privacy-critical TPR@1\%FPR metric by up to 45\% -- establishing the first non-trivial MIA specifically tailored for RF. Leveraging these refined metrics, we uncover a distinct temporal pattern: under standard uniform temporal training, a model's susceptibility to MIA strictly peaks at the integration midpoint, a phenomenon we justify via the network's forced deviation from linear approximations. Finally, we demonstrate that substituting uniform timestep sampling with a Symmetric Exponential (U-shaped) distribution effectively minimizes exposure to vulnerable intermediate timesteps. Extensive evaluations across three datasets confirm that this temporal regularization suppresses memorization while preserving generative fidelity.
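The Symmetric Exponential (U-shaped) timestep distribution from the abstract can be sketched as follows. One plausible construction, assumed here since the paper's exact parameterization is not given in this listing, draws from an exponential distribution truncated to [0, 1] (mass concentrated near 0) and reflects half the draws about the midpoint, yielding a density that is high near t=0 and t=1 and low near the vulnerable t=0.5 region; the rate `lam` is an illustrative choice:

```python
import numpy as np

def sample_u_shaped_timesteps(n, lam=4.0, rng=None):
    """Draw n training timesteps from a symmetric exponential (U-shaped)
    density on [0, 1]: mass concentrates near t=0 and t=1 and thins out
    at the midpoint, reducing exposure to the leakage-prone region.
    lam controls the sharpness (lam=4.0 is an assumed value)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(n)
    # inverse-CDF sample of Exp(lam) truncated to [0, 1]
    e = -np.log1p(-u * (1.0 - np.exp(-lam))) / lam
    # reflect half the draws so the density is symmetric about t = 0.5
    flip = rng.random(n) < 0.5
    return np.where(flip, 1.0 - e, e)
```

Swapping this sampler in for `t ~ Uniform(0, 1)` in the Flow Matching training loop is the temporal regularization the abstract describes: the model still sees all timesteps, just with intermediate ones downweighted.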
Problem

Research questions and friction points this paper is trying to address.

Rectified Flow
Memorization
Membership Inference Attack
Generalization
Temporal Training Dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rectified Flow
Membership Inference Attack
Memorization
Temporal Sampling
Privacy
Mingxing Rao
Vanderbilt University, Nashville TN 37235, USA
Daniel Moyer
Vanderbilt University
Medical Imaging · Machine Learning