🤖 AI Summary
Speculative Jacobi Decoding (SJD) is bottlenecked by low draft-token acceptance rates in high-entropy visual regions. To address this, this work proposes SJD-PAC, which combines a proactive drafting strategy that raises local acceptance rates in such regions with an adaptive continuation mechanism that keeps validating subsequent tokens after the first rejection, thereby avoiding full resampling. Together, the two components significantly increase the average number of accepted tokens per decoding step and deliver lossless acceleration, since the target distribution is left unchanged. Experiments on standard text-to-image generation benchmarks demonstrate a 3.8× inference speedup while preserving image quality.
📝 Abstract
Speculative Jacobi Decoding (SJD) offers a draft-model-free approach to accelerate autoregressive text-to-image synthesis. However, the high-entropy nature of visual generation yields low draft-token acceptance rates in complex regions, creating a bottleneck that severely limits overall throughput. To overcome this, we introduce SJD-PAC, an enhanced SJD framework. First, SJD-PAC employs a proactive drafting strategy to improve local acceptance rates in these challenging high-entropy regions. Second, we introduce an adaptive continuation mechanism that sustains sequence validation after an initial rejection, bypassing the need for full resampling. Working in tandem, these optimizations significantly increase the average acceptance length per step, boosting inference speed while strictly preserving the target distribution. Experiments on standard text-to-image benchmarks demonstrate that SJD-PAC achieves a $3.8\times$ speedup with lossless image quality.
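To make the acceptance-length bottleneck concrete, the following is a minimal sketch of the generic speculative-sampling verification rule that draft-and-verify decoders commonly use. It is not the SJD-PAC algorithm, whose details the abstract does not give. The function name `verify_drafts` and the list-of-lists representation of distributions are illustrative assumptions. The sketch shows the baseline behaviour in which a step ends at the first rejected draft token, which is the situation the adaptive continuation mechanism is described as avoiding.

```python
import random


def verify_drafts(drafts, q_dists, p_dists, rng):
    """Baseline speculative-sampling verification (illustrative, not SJD-PAC).

    drafts:  draft token ids, one per position
    q_dists: per-position distributions the drafts were sampled from
    p_dists: per-position target distributions
    Assumes q[x] > 0 for every draft token x (true if x was sampled from q).
    Returns (tokens, n_accepted): tokens kept this step and how many
    of them were accepted drafts.
    """
    out = []
    for x, q, p in zip(drafts, q_dists, p_dists):
        # Accept the draft token with probability min(1, p(x) / q(x)).
        if rng.random() < min(1.0, p[x] / q[x]):
            out.append(x)
            continue
        # First rejection: resample from the normalized residual max(0, p - q),
        # which keeps the output distributed according to p, then stop.
        residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
        out.append(rng.choices(range(len(p)), weights=residual)[0])
        return out, len(out) - 1
    return out, len(out)


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.5, 0.5], [1.0, 0.0], [0.5, 0.5]]
    p = [[0.5, 0.5], [0.0, 1.0], [0.5, 0.5]]
    # The second draft has zero target probability, so it is always rejected
    # and the third draft is never checked in this baseline.
    print(verify_drafts([1, 0, 1], q, p, rng))
```

In this baseline, a single rejection discards every later draft in the window, so regions where the draft and target distributions disagree often yield short acceptance lengths per step.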