🤖 AI Summary
Existing text-to-image diffusion workflows are often treated as black boxes, leading to non-shareable models, coarse-grained resource scheduling, and opaque internal data flows. This work proposes a microservice-based diffusion workflow architecture that decomposes the pipeline into loosely coupled model-execution nodes, enabling, for the first time, model-level elastic scaling, cross-request model sharing, and adaptive model parallelism. A distributed scheduling system with explicit model-inference management supports dynamic orchestration, independent scaling, and fine-grained resource sharing. Experimental results demonstrate that the proposed approach achieves up to threefold higher request throughput than existing systems and tolerates burst traffic spikes as high as eight times the baseline load.
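The per-model scaling and cross-request sharing described above can be illustrated with a toy scheduler sketch (all class and method names here are hypothetical illustrations, not the paper's actual API): each model in a workflow becomes a separately registered node, two workflows that reference the same model reuse one node, and replica counts are adjusted for a single model rather than for the whole pipeline.

```python
from dataclasses import dataclass


@dataclass
class ModelNode:
    """One independently managed model-execution node (hypothetical sketch)."""
    name: str
    replicas: int = 1


class WorkflowScheduler:
    """Toy scheduler: one shared node pool, scaled per model, not per workflow."""

    def __init__(self):
        self.nodes: dict[str, ModelNode] = {}

    def register(self, name: str) -> ModelNode:
        # Cross-request model sharing: reuse an existing node for this model
        # instead of provisioning a duplicate copy per workflow.
        return self.nodes.setdefault(name, ModelNode(name))

    def scale(self, name: str, delta: int) -> None:
        # Model-level elastic scaling: adjust replicas of one node only,
        # leaving the rest of the workflow untouched.
        node = self.nodes[name]
        node.replicas = max(1, node.replicas + delta)


# Two workflows reference the same base diffusion model and share one node.
sched = WorkflowScheduler()
a = sched.register("base_diffusion")
b = sched.register("base_diffusion")
assert a is b  # same node object, shared across requests
sched.register("safety_checker")
sched.scale("base_diffusion", +3)  # scale only the bottleneck model
```

A monolithic serving system would instead replicate the entire workflow (base model plus auxiliaries) as one unit; decomposing it as above lets the scheduler add replicas only where the load actually is.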
📝 Abstract
Text-to-image generation executes a diffusion workflow comprising multiple models centered on a base diffusion model. Existing serving systems treat each workflow as an opaque monolith, provisioning, placing, and scaling all constituent models together, which obscures internal dataflow, prevents model sharing, and enforces coarse-grained resource management. In this paper, we make a case for micro-serving diffusion workflows with LegoDiffusion, a system that decomposes a workflow into loosely coupled model-execution nodes that can be independently managed and scheduled. By explicitly managing individual model inference, LegoDiffusion unlocks cluster-scale optimizations, including per-model scaling, model sharing, and adaptive model parallelism. Collectively, LegoDiffusion outperforms existing diffusion workflow serving systems, sustaining up to 3x higher request rates and tolerating up to 8x higher burst traffic.