🤖 AI Summary
Normalizing flows (NFs) suffer from limited semantic representational capacity due to their reliance on log-likelihood optimization, resulting in suboptimal generation quality. To address this, we propose Reverse Representation Alignment (RRA): leveraging the invertibility of NFs, RRA aligns intermediate features of the generative (reverse) pass with the semantic embedding space of a frozen, pre-trained vision foundation model (e.g., CLIP or DINO), enabling unsupervised, model-free semantic alignment during generation. Furthermore, we introduce a training-free, test-time optimization algorithm for classification, which more directly probes the semantic knowledge embedded in the flow. Unlike conventional forward-pass regularization paradigms, RRA operates on the reverse pass, significantly enhancing both semantic expressiveness and generation fidelity. Our method establishes new state-of-the-art performance for NFs on ImageNet at 64×64 and 256×256 resolutions, accelerates training by 3.3×, and simultaneously improves FID scores and classification accuracy.
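The core alignment idea above can be sketched as a simple auxiliary loss: project an intermediate feature from the flow's reverse pass into the foundation model's embedding dimension and maximize cosine similarity with the frozen encoder's embedding. This is a minimal, hypothetical illustration, not the paper's implementation; the projection head `proj_W`, the feature dimensions, and the random stand-ins for flow features and CLIP/DINO embeddings are all assumptions.

```python
import numpy as np

def cosine_align_loss(flow_feats, frozen_feats):
    """Negative mean cosine similarity between per-sample feature rows.

    flow_feats:   (batch, d) projected features from the flow's reverse pass
    frozen_feats: (batch, d) embeddings from a frozen vision encoder
    """
    a = flow_feats / np.linalg.norm(flow_feats, axis=1, keepdims=True)
    b = frozen_feats / np.linalg.norm(frozen_feats, axis=1, keepdims=True)
    return -float(np.mean(np.sum(a * b, axis=1)))

rng = np.random.default_rng(0)
batch, flow_dim, embed_dim = 4, 32, 16

# Hypothetical learnable projection head mapping flow features -> embedding dim.
proj_W = rng.normal(size=(flow_dim, embed_dim)) / np.sqrt(flow_dim)

# Stand-ins: intermediate reverse-pass features and frozen encoder embeddings.
inter_feats = rng.normal(size=(batch, flow_dim))
frozen_embed = rng.normal(size=(batch, embed_dim))

loss = cosine_align_loss(inter_feats @ proj_W, frozen_embed)
```

In training, `loss` would be added to the usual negative log-likelihood objective; perfectly aligned features drive it to its minimum of -1.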
📝 Abstract
Normalizing Flows (NFs) are a class of generative models distinguished by a mathematically invertible architecture, where the forward pass transforms data into a latent space for density estimation, and the reverse pass generates new samples from this space. This characteristic creates an intrinsic synergy between representation learning and data generation. However, the generative quality of standard NFs is limited by poor semantic representations from log-likelihood optimization. To remedy this, we propose a novel alignment strategy that creatively leverages the invertibility of NFs: instead of regularizing the forward pass, we align the intermediate features of the generative (reverse) pass with representations from a powerful vision foundation model, demonstrating superior effectiveness over naive alignment. We also introduce a novel training-free, test-time optimization algorithm for classification, which provides a more intrinsic evaluation of the NF's embedded semantic knowledge. Comprehensive experiments demonstrate that our approach accelerates the training of NFs by over 3.3×, while simultaneously delivering significant improvements in both generative quality and classification accuracy. New state-of-the-art results for NFs are established on ImageNet 64×64 and 256×256. Our code is available at https://github.com/MCG-NJU/FlowBack.
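The abstract's training-free classification idea rests on a general property of flows: because an NF assigns an exact log-likelihood to every input via the change of variables, a class-conditional prior lets one classify by likelihood argmax with no extra training. The sketch below illustrates only this generic mechanism under assumed Gaussian class-conditional latents; the class means, dimensions, and the omission of the flow itself are illustrative assumptions, not the paper's algorithm (for a fixed input, the log-det-Jacobian term is the same for every class, so it cancels in the argmax and is dropped here).

```python
import numpy as np

def class_log_likelihood(z, mu):
    """log N(z; mu, I): Gaussian log-density of latent z under class mean mu.

    In a real flow, z = f(x) and the log-det-Jacobian of f would be added,
    but it is class-independent for a fixed x, so it cannot change the argmax.
    """
    d = z.shape[-1]
    return -0.5 * float(np.sum((z - mu) ** 2)) - 0.5 * d * np.log(2 * np.pi)

def classify(z, class_means):
    """Pick the class whose conditional prior gives z the highest likelihood."""
    scores = [class_log_likelihood(z, mu) for mu in class_means]
    return int(np.argmax(scores))

# Two hypothetical class-conditional latent means in an 8-dim latent space.
class_means = np.stack([np.full(8, -2.0), np.full(8, 2.0)])

z = np.full(8, 1.7)            # latent obtained from the flow's forward pass
pred = classify(z, class_means)  # nearest class mean wins
```

With unit-covariance Gaussians this reduces to nearest-mean classification in latent space, which is why no additional training is required.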