Reviving ConvNeXt for Efficient Convolutional Diffusion Models

📅 2026-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the prevailing reliance on Transformer backbones in diffusion models, which has left the locality bias, parameter efficiency, and hardware friendliness of convolutional networks largely unexplored in generative modeling. The authors propose the Fully Convolutional Diffusion Model (FCDM), which adapts a ConvNeXt-style backbone to conditional diffusion generation. Using only about 50% of the FLOPs of DiT-XL/2, FCDM-XL reaches competitive generation quality with 7× fewer training steps at 256×256 and 7.5× fewer at 512×512. FCDM-XL can also be trained on a 4-GPU system, which the authors cite as evidence of the architecture's training efficiency.

📝 Abstract
Recent diffusion models increasingly favor Transformer backbones, motivated by the remarkable scalability of fully attentional architectures. Yet the locality bias, parameter efficiency, and hardware friendliness (the attributes that established ConvNets as the efficient vision backbone) have seen limited exploration in modern generative modeling. Here we introduce the fully convolutional diffusion model (FCDM), a model having a backbone similar to ConvNeXt, but designed for conditional diffusion modeling. We find that using only 50% of the FLOPs of DiT-XL/2, FCDM-XL achieves competitive performance with 7× and 7.5× fewer training steps at 256×256 and 512×512 resolutions, respectively. Remarkably, FCDM-XL can be trained on a 4-GPU system, highlighting the exceptional training efficiency of our architecture. Our results demonstrate that modern convolutional designs provide a competitive and highly efficient alternative for scaling diffusion models, reviving ConvNeXt as a simple yet powerful building block for efficient generative modeling.
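
To put the abstract's two efficiency figures together, the following is a rough back-of-the-envelope sketch, not a calculation taken from the paper. It assumes that total training compute scales as (FLOPs per forward pass) × (number of training steps), that both models train with the same batch size, and that the 50% FLOPs figure applies at both resolutions. The function name `relative_training_compute` and the dictionary keys are hypothetical names introduced only for illustration.

```python
from fractions import Fraction

def relative_training_compute(flops_ratio, step_reduction):
    """Estimate training compute relative to the baseline.

    Assumption (not from the paper): compute ~ FLOPs per step * steps,
    with identical batch size for both models.
    """
    return flops_ratio / step_reduction

# Figures quoted in the abstract: about 50% of DiT-XL/2 FLOPs,
# and 7x / 7.5x fewer training steps at the two resolutions.
FLOPS_RATIO = Fraction(1, 2)
STEP_REDUCTION = {"256x256": Fraction(7), "512x512": Fraction(15, 2)}

results = {
    res: relative_training_compute(FLOPS_RATIO, reduction)
    for res, reduction in STEP_REDUCTION.items()
}

for res, ratio in results.items():
    print(f"{res}: roughly {float(ratio):.3f} of the baseline training compute")
```

Under these assumptions, the combined saving comes to roughly 1/14 of the baseline training compute at 256×256 and 1/15 at 512×512. The actual figure depends on batch size, hardware utilization, and how the paper counts steps, none of which are stated in the abstract.
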
Problem

Research questions and friction points this paper is trying to address.

diffusion models
ConvNeXt
convolutional networks
efficient generative modeling
model scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

ConvNeXt
convolutional diffusion model
parameter efficiency
training efficiency
FCDM