Reviving ConvNeXt for Efficient Convolutional Diffusion Models

📅 2026-03-10

🤖 AI Summary

This work addresses the prevailing reliance on Transformer backbones in diffusion models, which leaves the strengths of convolutional networks (locality bias, parameter efficiency, and hardware friendliness) underexplored. The authors propose the Fully Convolutional Diffusion Model (FCDM), which adapts a ConvNeXt-style backbone to conditional diffusion modeling. Using only about 50% of the FLOPs of DiT-XL/2, FCDM-XL achieves competitive generation quality with 7× fewer training steps at 256×256 resolution and 7.5× fewer at 512×512. FCDM-XL can also be trained on a 4-GPU system, which the authors highlight as evidence of the architecture's training efficiency.

📝 Abstract

Recent diffusion models increasingly favor Transformer backbones, motivated by the remarkable scalability of fully attentional architectures. Yet the locality bias, parameter efficiency, and hardware friendliness--the attributes that established ConvNets as the efficient vision backbone--have seen limited exploration in modern generative modeling. Here we introduce the fully convolutional diffusion model (FCDM), a model having a backbone similar to ConvNeXt, but designed for conditional diffusion modeling. We find that using only 50% of the FLOPs of DiT-XL/2, FCDM-XL achieves competitive performance with 7$\times$ and 7.5$\times$ fewer training steps at 256$\times$256 and 512$\times$512 resolutions, respectively. Remarkably, FCDM-XL can be trained on a 4-GPU system, highlighting the exceptional training efficiency of our architecture. Our results demonstrate that modern convolutional designs provide a competitive and highly efficient alternative for scaling diffusion models, reviving ConvNeXt as a simple yet powerful building block for efficient generative modeling.
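
To make the parameter-efficiency argument concrete, the sketch below counts the learnable parameters in one standard ConvNeXt block and one standard Transformer block of the same width. It is illustrative only: the formulas follow the commonly published ConvNeXt and ViT block layouts, not FCDM itself (the abstract does not spell out FCDM's block), and the function names and the width of 1152 are assumptions chosen for the example. Conditioning layers are ignored in both cases.

```python
def convnext_block_params(dim: int, kernel_size: int = 7, expansion: int = 4) -> int:
    """Parameters in a standard ConvNeXt block (assumed layout, not FCDM's)."""
    hidden = expansion * dim
    depthwise = dim * kernel_size * kernel_size + dim  # one k x k filter per channel, plus bias
    norm = 2 * dim                                     # LayerNorm scale and shift
    pw_expand = dim * hidden + hidden                  # 1x1 conv: dim -> hidden
    pw_project = hidden * dim + dim                    # 1x1 conv: hidden -> dim
    layer_scale = dim                                  # per-channel residual scale
    return depthwise + norm + pw_expand + pw_project + layer_scale


def transformer_block_params(dim: int, mlp_ratio: int = 4) -> int:
    """Parameters in a standard pre-norm Transformer block (no conditioning)."""
    hidden = mlp_ratio * dim
    qkv = dim * 3 * dim + 3 * dim                      # fused query/key/value projection
    attn_out = dim * dim + dim                         # attention output projection
    mlp = (dim * hidden + hidden) + (hidden * dim + dim)
    norms = 2 * (2 * dim)                              # two LayerNorms
    return qkv + attn_out + mlp + norms


if __name__ == "__main__":
    width = 1152  # hypothetical width, chosen only for illustration
    conv = convnext_block_params(width)
    attn = transformer_block_params(width)
    print(f"ConvNeXt block:    {conv:,} parameters")
    print(f"Transformer block: {attn:,} parameters")
    print(f"ratio: {conv / attn:.2f}")
```

In this accounting the two blocks share the same pointwise MLP cost; the difference comes from spatial mixing, where the depthwise convolution grows linearly with width while the attention projections grow quadratically.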

Problem

Research questions and friction points this paper is trying to address.

diffusion models

ConvNeXt

convolutional networks

efficient generative modeling

model scalability

Innovation

Methods, ideas, or system contributions that make the work stand out.

ConvNeXt

convolutional diffusion model

parameter efficiency