CIAR: Interval-based Collaborative Decoding for Image Generation Acceleration

📅 2026-03-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the computational inefficiency of autoregressive image generation models, whose sequential decoding hinders on-device deployment. The authors propose CIAR, a cloud-device collaborative framework that replaces discrete token sets with continuous probability intervals, so that token uncertainty can be quantified on the device and verification of visually redundant regions can be skipped. Through interval-enhanced decoding and distribution alignment training, CIAR accelerates generation while preserving image fidelity and semantic consistency. Compared to existing approaches, it achieves a 2.18× speed-up and reduces cloud requests by 70%. Key innovations include the continuous probability interval representation, a token-level redundancy-aware mechanism on the device, and a collaborative inference architecture.
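
As a reading aid for the headline numbers, the arithmetic below converts the reported 2.18× speed-up into a relative latency and the 70% reduction into a remaining share of cloud requests. The symbols $T_{\text{base}}$, $T_{\text{CIAR}}$, $R_{\text{base}}$ and $R_{\text{CIAR}}$ are notation assumed here for baseline and CIAR latency and cloud-request counts; they do not come from the paper, and the baseline is whichever method the authors compare against.

```latex
\[
\frac{T_{\text{CIAR}}}{T_{\text{base}}} = \frac{1}{2.18} \approx 0.459
\quad\Longrightarrow\quad
1 - \frac{T_{\text{CIAR}}}{T_{\text{base}}} \approx 54.1\%\ \text{lower latency},
\qquad
\frac{R_{\text{CIAR}}}{R_{\text{base}}} = 1 - 0.70 = 0.30 .
\]
```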

📝 Abstract

Auto-regressive (AR) models have recently made notable progress in image generation, achieving performance comparable to diffusion-based approaches. However, their computational intensity and sequential nature impede on-device deployment, causing disruptive latency. We address this with CIAR, a cloud-device collaboration framework that uses on-device self-verification to handle two key properties of visual synthesis: the vast token vocabulary required for high-fidelity images, and inherent spatial redundancy, which makes tokens in homogeneous regions extremely predictable while object boundaries remain highly uncertain. Uniform verification wastes resources on such redundant tokens. Our solution centers on an on-device token uncertainty quantifier that adopts continuous probability intervals, instead of conventional discrete solution sets, to accelerate processing and remain feasible for large visual vocabularies. Additionally, we incorporate an interval-enhanced decoding module to further speed up decoding, while a distribution alignment training strategy maintains visual fidelity and semantic consistency. Extensive experiments demonstrate that CIAR achieves a 2.18× speed-up and reduces cloud requests by 70% while preserving image quality compared to existing methods.
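
The selective-verification idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the continuous probability intervals are constructed, so the code below swaps in a much simpler stand-in: a token is accepted on the device when its top-1 probability lies in an assumed confidence interval `[tau, 1]`, and is sent to the cloud otherwise. The function name `gate_tokens`, the `cloud_verify` callback, and the threshold `tau` are all hypothetical, and the sketch ignores the autoregressive dependency between positions for brevity.

```python
from typing import Callable, List, Sequence, Tuple


def gate_tokens(
    device_probs: Sequence[Sequence[float]],
    cloud_verify: Callable[[int, int], int],
    tau: float = 0.9,
) -> Tuple[List[int], int]:
    """Accept confident device tokens locally; send the rest to the cloud.

    device_probs[i] is the device model's distribution at position i.
    cloud_verify(position, draft_token) is a hypothetical callback that
    returns the token the cloud model keeps at that position.
    tau is an assumed confidence threshold, not a value from the paper.
    """
    tokens: List[int] = []
    cloud_requests = 0
    for pos, probs in enumerate(device_probs):
        # Draft token: the device model's most likely token at this position.
        draft = max(range(len(probs)), key=lambda t: probs[t])
        if probs[draft] >= tau:
            # Confident (e.g. a homogeneous region): no cloud round trip.
            tokens.append(draft)
        else:
            # Uncertain (e.g. an object boundary): ask the cloud model.
            cloud_requests += 1
            tokens.append(cloud_verify(pos, draft))
    return tokens, cloud_requests


# Toy example with a 3-token vocabulary and 4 positions.
probs = [
    [0.97, 0.02, 0.01],  # flat background: confident
    [0.95, 0.03, 0.02],  # flat background: confident
    [0.40, 0.35, 0.25],  # boundary: uncertain
    [0.92, 0.05, 0.03],  # flat background: confident
]
# Stand-in cloud model that always answers with token 1.
tokens, n_cloud = gate_tokens(probs, cloud_verify=lambda pos, draft: 1)
```

In this toy run only the uncertain third position triggers a cloud request; the other three tokens are accepted on the device. Raising `tau` trades more cloud requests for closer agreement with the cloud model.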

Problem

Research questions and friction points this paper is trying to address.

auto-regressive image generation

on-device deployment

computational latency

token redundancy

visual synthesis

Innovation

Methods, ideas, or system contributions that make the work stand out.

interval-based decoding

collaborative inference

token uncertainty quantification

visual vocabulary acceleration

distribution alignment
