🤖 AI Summary
To address the high latency and computational overhead of diffusion-based text-to-image generation in edge computing environments, this paper proposes a semantic-aware hybrid generative acceleration framework. The framework synergistically integrates text-only generation with image-guided generation, initializing the denoising process using semantically similar cached reference images to substantially reduce the number of denoising steps. It introduces three key innovations: (1) a semantic classification-based caching mechanism, (2) a dynamic request scheduling algorithm, and (3) a correlation-aware proactive cache maintenance strategy—collectively ensuring semantic alignment and cache efficiency. Extensive experiments on a real-world edge system demonstrate that the proposed method reduces end-to-end generation latency by 41% and computational cost by 48%, while preserving image fidelity comparable to state-of-the-art methods.
📝 Abstract
Text-to-image generation with diffusion models has attained significant popularity due to its capability to produce high-quality images that adhere to textual prompts. However, integrating diffusion models into resource-constrained mobile and edge environments faces critical challenges because generation requires many denoising steps starting from random noise. A practical way to speed up denoising is to initialize the process with a noised reference image that is similar to the target: since the two images share similar layouts, structures, and details, fewer denoising steps are needed. Based on this idea, we present CacheGenius, a hybrid image generation system for edge computing that accelerates generation by combining text-to-image and image-to-image workflows, generating images from user text prompts using cached reference images. CacheGenius introduces a semantic-aware classified storage scheme and a request-scheduling algorithm that together ensure semantic alignment between references and targets. To sustain performance over time, it employs a cache maintenance policy that proactively evicts obsolete entries via correlation analysis. Evaluated in a distributed edge computing system, CacheGenius reduces generation latency by 41% and computational costs by 48% relative to baselines, while maintaining competitive image-quality metrics.
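The warm-start idea at the core of the abstract can be illustrated with a minimal sketch: instead of denoising from pure random noise over all T steps, a semantically similar cached reference is forward-noised to an intermediate timestep and denoising resumes from there, running fewer steps. This is an illustrative toy (NumPy arrays, a dummy denoiser, and the `strength` parameter are assumptions for demonstration), not the paper's actual implementation.

```python
import numpy as np

def forward_noise(x0, t, alphas_cumprod, rng):
    """DDPM-style forward process q(x_t | x_0): noise a clean reference to step t."""
    a = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps

def generate(denoise_step, T, alphas_cumprod, shape, rng,
             reference=None, strength=0.5):
    """Hybrid generation: cold start from pure noise (text-to-image),
    or warm start from a noised cached reference (image-guided),
    which skips the earliest, most expensive denoising steps."""
    if reference is None:
        x = rng.standard_normal(shape)          # cold start: all T steps
        t_start = T - 1
    else:
        t_start = int(strength * (T - 1))       # warm start: t_start + 1 steps
        x = forward_noise(reference, t_start, alphas_cumprod, rng)
    for t in range(t_start, -1, -1):
        x = denoise_step(x, t)                  # one reverse-diffusion step
    return x, t_start + 1                       # image and step count

# Toy usage: a placeholder denoiser that just shrinks the sample.
T = 50
alphas_cumprod = np.cumprod(np.linspace(0.999, 0.95, T))
rng = np.random.default_rng(0)
step = lambda x, t: 0.9 * x

_, cold_steps = generate(step, T, alphas_cumprod, (4, 4), rng)
_, warm_steps = generate(step, T, alphas_cumprod, (4, 4), rng,
                         reference=np.zeros((4, 4)), strength=0.5)
print(cold_steps, warm_steps)  # warm start runs fewer denoising steps
```

In a real diffusion pipeline the same effect is what img2img-style initialization achieves: the `strength` knob trades fidelity to the reference against the number of denoising steps, which is the latency/quality trade-off CacheGenius manages with its semantic cache.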