INRFlow: Flow Matching for INRs in Ambient Space

📅 2024-12-05
🤖 AI Summary
Existing flow-matching generative models typically rely on a two-stage training paradigm: a modality-specific compressor is pretrained first, and the generative model is then trained in that compressor's latent space. Because each modality needs its own hand-crafted compressor, this paradigm makes it hard to unify models across data domains. To address this, we propose INRFlow, a flow-matching framework trained in a single stage directly in ambient space (the raw data and coordinate space), without any pre-trained compressor, covering images, 3D point clouds, and protein structures. Our key contributions are: (1) a domain-agnostic flow-matching formulation; (2) a conditionally independent point-wise training objective that lets the model make predictions continuously in coordinate space; and (3) a combination of ideas from implicit neural representations (INRs) with Transformers. Experiments show that INRFlow achieves strong performance on all three modalities and outperforms comparable approaches.

📝 Abstract
Flow matching models have emerged as a powerful method for generative modeling on domains like images or videos, and even on irregular or unstructured data like 3D point clouds or protein structures. These models are commonly trained in two stages: first, a data compressor is trained, and in a subsequent training stage a flow matching generative model is trained in the latent space of the data compressor. This two-stage paradigm makes it difficult to unify models across data domains, as hand-crafted compressor architectures are used for different data modalities. To this end, we introduce INRFlow, a domain-agnostic approach to learn flow matching transformers directly in ambient space. Drawing inspiration from INRs, we introduce a conditionally independent point-wise training objective that enables INRFlow to make predictions continuously in coordinate space. Our empirical results demonstrate that INRFlow effectively handles different data modalities such as images, 3D point clouds and protein structure data, achieving strong performance in different domains and outperforming comparable approaches. INRFlow is a promising step towards domain-agnostic flow matching generative models that can be trivially adopted in different data domains.
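
To make the point-wise objective more concrete, here is a minimal sketch of what a flow matching loss evaluated independently per coordinate could look like. It is not the paper's implementation. It assumes a linear interpolation path between Gaussian noise and data with target velocity x1 - x0 (a common flow matching choice that the abstract does not confirm), and it assumes each point's prediction is conditioned on a shared context, here simply the whole noisy sample. The names `interpolate`, `pointwise_fm_loss`, and `predict_velocity` are hypothetical.

```python
import random


def interpolate(x0, x1, t):
    """Linear path x_t = (1 - t) * x0 + t * x1 (assumed path, not from the paper)."""
    return [[(1 - t) * a + t * b for a, b in zip(p0, p1)]
            for p0, p1 in zip(x0, x1)]


def pointwise_fm_loss(predict_velocity, coords, x1, rng):
    """Single-sample estimate of a point-wise flow matching loss.

    coords: one coordinate tuple per point (pixel position, point index, ...).
    x1: one value vector per coordinate (RGB value, xyz position, ...).
    predict_velocity: hypothetical model stand-in, called once per point as
        predict_velocity(coord, noisy_value, t, context).
    """
    t = rng.random()                                      # time in [0, 1)
    x0 = [[rng.gauss(0.0, 1.0) for _ in p] for p in x1]   # noise sample
    xt = interpolate(x0, x1, t)
    context = (coords, xt)        # shared conditioning for every point
    total, count = 0.0, 0
    for c, p0, p1, pt in zip(coords, x0, x1, xt):
        target = [b - a for a, b in zip(p0, p1)]          # velocity x1 - x0
        pred = predict_velocity(c, pt, t, context)
        # Each point adds its own squared error; given the context, the
        # terms are independent of one another.
        total += sum((u - v) ** 2 for u, v in zip(pred, target))
        count += len(target)
    return total / count
```

Because the loss is a plain average over points, it can be evaluated on any subset of coordinates, which is one plausible reading of how training avoids a domain-specific compressor.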

Problem

Research questions and friction points this paper is trying to address.

Eliminates the need for domain-specific data compressors in flow matching
Enables continuous predictions in coordinate space across modalities
Unifies generative modeling for images, point clouds, and proteins

Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow matching transformers in ambient space
Conditionally independent point-wise training objective
Handles images, 3D point clouds, and protein structure data
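
The points above suggest that generation can be queried at arbitrary coordinates. A minimal sketch of that idea follows, assuming sampling is done by integrating a learned velocity field from t = 0 (noise) to t = 1 (data) with fixed-step Euler updates. The solver, the step count, and the names `sample_at_coords` and `predict_velocity` are illustrative assumptions, not details taken from the paper.

```python
import random


def sample_at_coords(predict_velocity, coords, dim, steps, rng):
    """Euler integration of dx/dt = v(coord, x, t) at the given coordinates.

    coords: any list of coordinate tuples; its length and spacing are free,
        so the same velocity function can be queried at any resolution.
    dim: number of values per point (e.g. 3 for RGB or xyz).
    predict_velocity: hypothetical model stand-in, called once per point as
        predict_velocity(coord, value, t, context).
    """
    # Start from independent Gaussian noise at every queried coordinate.
    x = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in coords]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        context = (coords, x)     # shared conditioning for every point
        v = [predict_velocity(c, p, t, context) for c, p in zip(coords, x)]
        x = [[a + dt * b for a, b in zip(p, vel)] for p, vel in zip(x, v)]
    return x
```

Calling the function with a coarse and then a dense list of coordinates reuses the same velocity function without any architectural change, which is the property the point-wise formulation is meant to provide.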