Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy

📅 2025-10-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Autoregressive image generation suffers from the low information density and spatially non-uniform distribution of image tokens, which limits both generation quality and decoding speed. To address this, we propose an entropy-informed decoding strategy. Our key contributions are: (1) a dynamic temperature control mechanism guided by the spatial entropy of token distributions, which balances content diversity, alignment accuracy, and structural coherence; (2) entropy-aware acceptance rules for speculative decoding; and (3) applicability to both mask-based and scale-wise autoregressive models without extra computational overhead. Experiments across multiple benchmarks and AR image generation models show near-lossless generation at about 85% of the inference cost of conventional acceleration methods, with gains in both generation quality and sampling speed.

📝 Abstract

In this work, we first revisit the sampling issues in current autoregressive (AR) image generation models and identify that image tokens, unlike text tokens, exhibit lower information density and non-uniform spatial distribution. Accordingly, we present an entropy-informed decoding strategy that facilitates higher autoregressive generation quality with faster synthesis speed. Specifically, the proposed method introduces two main innovations: 1) dynamic temperature control guided by spatial entropy of token distributions, enhancing the balance between content diversity, alignment accuracy, and structural coherence in both mask-based and scale-wise models, without extra computational overhead, and 2) entropy-aware acceptance rules in speculative decoding, achieving near-lossless generation at about 85% of the inference cost of conventional acceleration methods. Extensive experiments across multiple benchmarks using diverse AR image generation models demonstrate the effectiveness and generalizability of our approach in enhancing both generation quality and sampling speed.
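To make the idea of entropy-guided temperature control concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not give the actual schedule, so the linear mapping, its direction (lower temperature at low-entropy positions), the `t_min`/`t_max` bounds, and all function names are assumptions rather than the authors' method.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_to_temperature(h, vocab_size, t_min=0.5, t_max=1.0):
    """Hypothetical schedule: map normalized entropy linearly to a temperature.

    The linear form and its direction are assumptions made for illustration;
    the source does not specify them.
    """
    if vocab_size < 2:
        return t_min
    h_norm = h / math.log(vocab_size)      # entropy relative to its maximum
    h_norm = min(max(h_norm, 0.0), 1.0)    # clamp against rounding error
    return t_min + (t_max - t_min) * h_norm

def dynamic_temperature_probs(logits, t_min=0.5, t_max=1.0):
    """Compute the entropy at temperature 1, then re-scale the logits
    with a temperature derived from that entropy."""
    base = softmax(logits)
    t = entropy_to_temperature(entropy(base), len(logits), t_min, t_max)
    return softmax(logits, t), t
```

In an image model this would be applied independently at each spatial token position, so flat and detailed regions can receive different temperatures. The only extra work per position is one entropy computation over logits the model has already produced.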

Problem

Research questions and friction points this paper is trying to address.

Improving autoregressive image generation quality and speed

Addressing low information density in image tokens

Enhancing sampling efficiency with entropy-guided decoding

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic temperature control using spatial entropy guidance

Entropy-aware acceptance rules for speculative decoding

Balancing diversity and coherence without extra computation
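As a rough illustration of what an entropy-aware acceptance rule for speculative decoding might look like, the sketch below starts from the standard speculative sampling test, which accepts a draft token x with probability min(1, p(x)/q(x)) for target distribution p and draft distribution q. The entropy-dependent relaxation (the `relax` parameter and the multiplicative bonus) is a hypothetical choice for illustration; the paper's actual criterion is not stated in this summary.

```python
import math
import random

def normalized_entropy(probs):
    """Shannon entropy divided by its maximum, log(len(probs)), in [0, 1]."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return min(max(h / math.log(len(probs)), 0.0), 1.0)

def acceptance_probability(token, target_probs, draft_probs, relax=0.0):
    """Probability of accepting a draft token.

    relax = 0.0 gives the standard rule min(1, p/q).
    relax > 0.0 is a hypothetical relaxation: the ratio is boosted more
    when the target distribution has high entropy, i.e. when many tokens
    are roughly equally plausible at this position.
    """
    q = draft_probs[token]
    if q <= 0.0:
        return 0.0  # the draft model could not have proposed this token
    ratio = target_probs[token] / q
    bonus = 1.0 + relax * normalized_entropy(target_probs)
    return min(1.0, ratio * bonus)

def accept_draft_token(token, target_probs, draft_probs, relax=0.0, rng=random):
    """Sample the accept/reject decision for one draft token."""
    p_accept = acceptance_probability(token, target_probs, draft_probs, relax)
    return rng.random() < p_accept
```

Any `relax` above zero gives up the exact-distribution guarantee of standard speculative sampling in exchange for accepting more draft tokens, which is consistent with the "near-lossless" wording used in the abstract.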

🔎 Similar Papers

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding