Few-step Flow for 3D Generation via Marginal-Data Transport Distillation

📅 2025-09-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing flow-based 3D generative models require dozens of sampling steps, suffering from low inference efficiency; meanwhile, few-step distillation techniques developed for 2D (e.g., consistency models) have yet to be effectively extended to 3D. Method: We propose MDT-dist, the first framework to introduce the Marginal-Data Transport objective into 3D generative distillation. It jointly optimizes Velocity Matching and Velocity Distillation to learn the transport without computing the intractable velocity-field integral, and uses probability density distillation for faithful knowledge transfer from the teacher (TRELLIS) to the student. Contribution/Results: On the TRELLIS benchmark, MDT-dist reduces sampling steps from 25 to just 1–2 per stage, achieving up to a 9.0× speedup with only 0.68 seconds of latency, while preserving state-of-the-art visual and geometric fidelity and significantly outperforming existing consistency-model-based approaches.

📝 Abstract
Flow-based 3D generation models typically require dozens of sampling steps during inference. Although few-step distillation methods, particularly Consistency Models (CMs), have achieved substantial advancements in accelerating 2D diffusion models, they remain under-explored for more complex 3D generation tasks. In this study, we propose a novel framework, MDT-dist, for few-step 3D flow distillation. Our approach is built upon a primary objective: distilling the pretrained model to learn the Marginal-Data Transport. Directly learning this objective requires integrating the velocity fields, but this integral is intractable to compute. Therefore, we propose two optimizable objectives, Velocity Matching (VM) and Velocity Distillation (VD), to equivalently convert the optimization target from the transport level to the velocity level and the distribution level, respectively. Velocity Matching (VM) learns to stably match the velocity fields between the student and the teacher, but inevitably provides biased gradient estimates. Velocity Distillation (VD) further enhances the optimization process by leveraging the learned velocity fields to perform probability density distillation. When evaluated on the pioneering 3D generation framework TRELLIS, our method reduces the sampling steps of each flow transformer from 25 to 1 or 2, achieving 0.68s (1 step x 2) and 0.94s (2 steps x 2) latency with 9.0x and 6.5x speedups on an A800 GPU, while preserving high visual and geometric fidelity. Extensive experiments demonstrate that our method significantly outperforms existing CM distillation methods and enables TRELLIS to achieve superior performance in few-step 3D generation.
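Since the marginal-data transport integral itself is intractable, the Velocity Matching objective instead regresses the student's velocity field onto the teacher's at sampled interpolation points. A minimal NumPy sketch of this idea, with toy closed-form stand-ins for both networks (the function names, the toy teacher, and the linear student are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher_velocity(x_t, t):
    # Toy stand-in for a pretrained teacher velocity field
    # (not the real TRELLIS flow transformer).
    return 2.0 * x_t - t

def student_velocity(x_t, t, w):
    # Toy one-parameter student; a real student would be a network.
    return w * x_t

def vm_loss(w, data_batch):
    # Velocity Matching: sample times t and interpolated points x_t
    # between noise and data, then match velocities pointwise instead
    # of integrating the (intractable) transport.
    t = rng.uniform(0.0, 1.0, size=data_batch.shape)
    noise = rng.standard_normal(data_batch.shape)
    x_t = (1.0 - t) * noise + t * data_batch
    diff = student_velocity(x_t, t, w) - teacher_velocity(x_t, t)
    return float(np.mean(diff ** 2))
```

This matches the abstract's framing: the pointwise regression is stable to optimize, though (as the paper notes) its gradient estimates are biased, which motivates the complementary Velocity Distillation objective.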
Problem

Research questions and friction points this paper is trying to address.

Accelerating 3D flow generation models with fewer sampling steps
Distilling pretrained models via Marginal-Data Transport learning
Overcoming biased gradients in velocity matching for 3D generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distills pretrained model via Marginal-Data Transport
Uses Velocity Matching and Velocity Distillation objectives
Reduces sampling steps from 25 to 1–2 per flow transformer
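Cutting the step count from 25 to 1–2 amounts to integrating the learned flow ODE with very few Euler steps. A hedged sketch of such a few-step sampler (the `velocity` argument is a hypothetical placeholder for the distilled student, not the TRELLIS sampler):

```python
import numpy as np

def sample(velocity, x0, num_steps):
    # Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step
    # Euler. With a distilled student, num_steps can be 1 or 2 instead
    # of the 25 steps the teacher requires.
    x = x0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity(x, t)
    return x
```

For example, `sample(lambda x, t: -x, np.ones(3), 2)` takes two Euler steps of size 0.5 along the toy field `v = -x`.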