Generalization and Memorization in Rectified Flow

📅 2026-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work presents the first systematic investigation into the memorization behavior and associated privacy risks of Rectified Flow models. We propose a non-trivial membership inference attack (MIA) that leverages a complexity-calibrated test statistic, revealing that training data leakage is most pronounced at the midpoint of the flow integration trajectory. To mitigate this vulnerability, we introduce a symmetric exponential time-step sampling strategy that effectively suppresses memorization while preserving sample quality. Experimental results show that the proposed MIA achieves up to a 15% increase in AUC and a 45% improvement in TPR@1%FPR over baseline test statistics, while the proposed sampling strategy suppresses these leakage signals, thereby both exposing and alleviating the inherent privacy risks in Rectified Flow models.
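The complexity-calibrated statistic can be illustrated with a minimal sketch. The idea, as described above, is to divide a per-sample Rectified Flow loss by a measure of the image's intrinsic spatial complexity, so that visually simple images do not masquerade as memorized ones. The complexity proxy below (mean gradient magnitude) and the function names are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def flow_matching_loss(model, x, t, noise):
    """Per-sample Rectified Flow loss at time t: the model predicts the
    straight-line velocity (x - noise) from the interpolant x_t."""
    x_t = (1 - t) * noise + t * x   # linear interpolation between noise and data
    target = x - noise              # RF velocity target
    pred = model(x_t, t)
    return float(np.mean((pred - target) ** 2))

def complexity(x):
    """Hypothetical spatial-complexity proxy: mean gradient magnitude.
    The paper's actual calibration measure is not specified here."""
    gx = np.diff(x, axis=-1)
    gy = np.diff(x, axis=-2)
    return float(np.mean(np.abs(gx)) + np.mean(np.abs(gy))) + 1e-8

def t_mc_cal(model, x, t=0.5, seed=0):
    """Complexity-calibrated membership score: lower values suggest the
    sample was memorized. Evaluated at the trajectory midpoint t=0.5,
    where the summary reports leakage peaks."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(x.shape)
    return flow_matching_loss(model, x, t, noise) / complexity(x)
```

In an attack, one would threshold this score: training members tend to have anomalously low calibrated loss at the midpoint timestep.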

📝 Abstract
Generative models based on the Flow Matching objective, particularly Rectified Flow (RF), have emerged as a dominant paradigm for efficient, high-fidelity image synthesis. However, while existing research heavily prioritizes generation quality and architectural scaling, the underlying dynamics of how RF models memorize training data remain largely underexplored. In this paper, we systematically investigate the memorization behaviors of RF through the test statistics of Membership Inference Attacks (MIA). We progressively formulate three test statistics, culminating in a complexity-calibrated metric ($T_\text{mc\_cal}$) that successfully decouples intrinsic image spatial complexity from genuine memorization signals. This calibration yields a significant performance surge -- boosting attack AUC by up to 15\% and the privacy-critical TPR@1\%FPR metric by up to 45\% -- establishing the first non-trivial MIA specifically tailored for RF. Leveraging these refined metrics, we uncover a distinct temporal pattern: under standard uniform temporal training, a model's susceptibility to MIA strictly peaks at the integration midpoint, a phenomenon we justify via the network's forced deviation from linear approximations. Finally, we demonstrate that substituting uniform timestep sampling with a Symmetric Exponential (U-shaped) distribution effectively minimizes exposure to vulnerable intermediate timesteps. Extensive evaluations across three datasets confirm that this temporal regularization suppresses memorization while preserving generative fidelity.
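The Symmetric Exponential (U-shaped) timestep distribution from the abstract can be sketched as follows. One plausible construction, assumed here since the paper's exact parameterization is not given in this listing, draws from an exponential distribution truncated to [0, 1] (mass concentrated near 0) and reflects half the draws about the midpoint, yielding a density that is high near t=0 and t=1 and low near the vulnerable t=0.5 region; the rate `lam` is an illustrative choice:

```python
import numpy as np

def sample_u_shaped_timesteps(n, lam=4.0, rng=None):
    """Draw n training timesteps from a symmetric exponential (U-shaped)
    density on [0, 1]: mass concentrates near t=0 and t=1 and thins out
    at the midpoint, reducing exposure to the leakage-prone region.
    lam controls the sharpness (lam=4.0 is an assumed value)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(n)
    # inverse-CDF sample of Exp(lam) truncated to [0, 1]
    e = -np.log1p(-u * (1.0 - np.exp(-lam))) / lam
    # reflect half the draws so the density is symmetric about t = 0.5
    flip = rng.random(n) < 0.5
    return np.where(flip, 1.0 - e, e)
```

Swapping this sampler in for `t ~ Uniform(0, 1)` in the Flow Matching training loop is the temporal regularization the abstract describes: the model still sees all timesteps, just with intermediate ones downweighted.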
Problem

Research questions and friction points this paper is trying to address.

Rectified Flow
Memorization
Membership Inference Attack
Generalization
Temporal Training Dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rectified Flow
Membership Inference Attack
Memorization
Temporal Sampling
Privacy
Mingxing Rao
Vanderbilt University, Nashville TN 37235, USA
Daniel Moyer
Vanderbilt University
Medical Imaging · Machine Learning