Autoregressive Image Generation with Masked Bit Modeling

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Discrete visual generative models have long underperformed their continuous counterparts. The paper attributes this gap to limited codebook size, which leaves too few bits in the latent space, rather than to discreteness itself. It proposes BAR (masked Bit AutoRegressive modeling), a scalable autoregressive framework that decomposes discrete tokens into binary bit sequences and uses bit-wise masked modeling to support training and sampling with arbitrarily large codebooks. BAR reports a state-of-the-art gFID of 0.99 on ImageNet-256, surpassing leading continuous and discrete approaches, while reducing sampling cost and converging faster than prior continuous approaches.
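The bit decomposition mentioned in the summary can be illustrated with a small sketch. This is not the paper's implementation: the function names `token_to_bits` and `bits_to_token`, the most-significant-bit-first ordering, and the example widths are assumptions chosen for illustration. The sketch only shows that a token index from a codebook of size 2^K can be written as K binary values and recovered exactly, which is presumably what lets a bit-level head avoid a single 2^K-way classification.

```python
def token_to_bits(token_id: int, num_bits: int) -> list[int]:
    """Write a token index as num_bits binary digits, most significant bit first.

    Hypothetical helper for illustration; the bit ordering is an assumption.
    """
    if not 0 <= token_id < (1 << num_bits):
        raise ValueError(f"token_id {token_id} does not fit in {num_bits} bits")
    return [(token_id >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


def bits_to_token(bits: list[int]) -> int:
    """Inverse of token_to_bits: rebuild the token index from its bits."""
    token_id = 0
    for bit in bits:
        token_id = (token_id << 1) | bit
    return token_id


# Token 5 in a 16-entry (4-bit) codebook.
example_bits = token_to_bits(5, 4)
print(example_bits)                 # [0, 1, 0, 1]
print(bits_to_token(example_bits))  # 5
```

Under this encoding, growing the codebook from 2^12 to 2^16 entries adds four bits per token rather than multiplying the number of output classes by sixteen.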

📝 Abstract
This paper challenges the dominance of continuous pipelines in visual generation. We systematically investigate the performance gap between discrete and continuous methods. Contrary to the belief that discrete tokenizers are intrinsically inferior, we demonstrate that the disparity arises primarily from the total number of bits allocated in the latent space (i.e., the compression ratio). We show that scaling up the codebook size effectively bridges this gap, allowing discrete tokenizers to match or surpass their continuous counterparts. However, existing discrete generation methods struggle to capitalize on this insight, suffering from performance degradation or prohibitive training costs with scaled codebooks. To address this, we propose masked Bit AutoRegressive modeling (BAR), a scalable framework that supports arbitrary codebook sizes. By equipping an autoregressive transformer with a masked bit modeling head, BAR predicts discrete tokens by progressively generating their constituent bits. BAR achieves a new state-of-the-art gFID of 0.99 on ImageNet-256, outperforming leading methods across both continuous and discrete paradigms, while significantly reducing sampling costs and converging faster than prior continuous approaches. The project page is available at https://bar-gen.github.io/
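The abstract's point about the total number of bits in the latent space can be made concrete with some arithmetic. The sketch below is an illustrative reading, not the paper's accounting: it assumes a discrete latent of N tokens from a codebook of size V carries N · log2(V) bits, and it takes the compression ratio to be raw image bits divided by latent bits. The token count (256), the 8-bit RGB input, and the codebook sizes are hypothetical values chosen only to show the trend.

```python
import math


def discrete_latent_bits(num_tokens: int, codebook_size: int) -> float:
    """Bits carried by num_tokens indices into a codebook of codebook_size entries."""
    return num_tokens * math.log2(codebook_size)


def raw_image_bits(height: int, width: int, channels: int = 3, bit_depth: int = 8) -> int:
    """Bits in an uncompressed image (assumed 8-bit RGB by default)."""
    return height * width * channels * bit_depth


def compression_ratio(image_bits: float, latent_bits: float) -> float:
    """Raw bits per latent bit; larger means more aggressive compression."""
    return image_bits / latent_bits


image_bits = raw_image_bits(256, 256)  # 1,572,864 bits
num_tokens = 256                       # hypothetical 16 x 16 token grid

for exponent in (12, 16, 32):          # hypothetical codebook sizes 2^12, 2^16, 2^32
    bits = discrete_latent_bits(num_tokens, 2 ** exponent)
    ratio = compression_ratio(image_bits, bits)
    print(f"codebook 2^{exponent}: {bits:.0f} latent bits, ratio {ratio:.0f}x")
```

With the token count held fixed, each doubling of the codebook adds one bit per token, so larger codebooks lower the compression ratio. That is the direction in which the abstract says discrete tokenizers close the gap with continuous ones.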

Problem

Research questions and friction points this paper is trying to address.

discrete image generation
codebook scaling
autoregressive modeling
masked bit modeling
visual generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked Bit Modeling
Autoregressive Image Generation
Discrete Tokenizer
Scalable Codebook
BAR