🤖 AI Summary
Normalized flow generative models suffer from interpolation paths deviating from the data manifold, primarily due to norm drift induced by Gaussian base distributions in latent space. To address this, we propose a norm-constrained base distribution reconstruction framework—introducing Dirichlet and von Mises–Fisher distributions into normalized flows for the first time. These distributions explicitly constrain latent variables to the unit simplex or unit hypersphere, respectively, ensuring geometrically consistent interpolation trajectories. Our method requires no architectural modifications to the flow network and provides an interpretable, unambiguous interpolation criterion, effectively overcoming interpolation distortion inherent to the Gaussian assumption. Experiments demonstrate consistent improvements over baselines across all major evaluation metrics: bits/dim, Fréchet Inception Distance (FID), and Kernel Inception Distance (KID). Interpolation quality is significantly enhanced while strictly preserving original generation performance.
📝 Abstract
Generative models based on normalizing flows are very successful in modeling complex data distributions using simpler ones. However, straightforward linear interpolations show unexpected side effects, as interpolation paths lie outside the area where samples are observed. This is caused by the standard choice of Gaussian base distributions and can be seen in the norms of the interpolated samples. This observation suggests that correcting the norm should generally result in better interpolations, but it is not clear how to correct the norm in an unambiguous way. In this paper, we solve this issue by enforcing a fixed norm and, hence, change the base distribution, to allow for a principled way of interpolation. Specifically, we use the Dirichlet and von Mises-Fisher base distributions. Our experimental results show superior performance in terms of bits per dimension, Fr'echet Inception Distance (FID), and Kernel Inception Distance (KID) scores for interpolation, while maintaining the same generative performance.