LatentBox: An Efficient Latent-First Storage System for AI-Generated Images

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

The surge in AI-generated images has exposed capacity and bandwidth bottlenecks in traditional pixel-based storage systems, which persist large amounts of redundant data. This work proposes a latent-first storage system designed around the large-scale access patterns of AI-generated imagery. By persisting compressed, model-native latent representations instead of raw pixels and reconstructing images on demand via GPU inference, the system trades inexpensive computation for significant storage savings. It further incorporates dynamic cache partitioning and an adaptive scheduling mechanism to optimize access latency. Evaluated on a real-world production trace comprising two billion requests, the system reduces persistent storage by 78.7% while maintaining comparable or even lower mean and tail latencies than a purely image-based store.

📝 Abstract

The explosive growth of AI-generated images has created a sustainability challenge for storage infrastructure. Platforms like Midjourney and Adobe Firefly already host billions of generative images, yet conventional object stores persist them as blobs of full-resolution pixels, consuming huge amounts of storage capacity and bandwidth. Unlike natural photos, however, AI-generated images can be deterministically reconstructed from compact, model-native latent tensors, making persistent image storage fundamentally redundant. This paper presents LatentBox, a latent-first storage system for AI-generated images. LatentBox treats compressed latents as durable storage objects and uses on-demand GPU reconstruction on the read path to trade inexpensive compute for large persistent storage savings. Our design is guided by the first large-scale analysis of AI-generated image access we are aware of, based on a 35-month, 2-billion-request production trace from a major generative-content platform. Motivated by the trace analysis, LatentBox keeps frequently accessed images in decoded pixel format for fast hits, stores less-active objects as compressed latents to expand effective cache capacity, and continuously adjusts the split between the image and latent caches to optimize user-perceived access latency. We build a LatentBox prototype and evaluate it with the production trace. LatentBox reduces persistent storage by 78.7% with competitive or even lower mean and tail latency compared with a purely image-based store.
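
The two-tier read path described in the abstract can be illustrated with a minimal Python sketch. This is not the LatentBox implementation: the class name `TwoTierCache`, the `store` mapping, the `decode` callable (a stand-in for GPU reconstruction), and the manual `set_split` knob are all hypothetical, and plain LRU eviction is an assumption. The abstract does not specify the actual eviction or split-adjustment policy.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy image/latent cache with a movable capacity split (illustrative only)."""

    def __init__(self, capacity_bytes, image_fraction, store, decode):
        self.capacity = capacity_bytes
        self.image_fraction = image_fraction  # share of capacity for decoded pixels
        self.store = store    # hypothetical durable store: key -> latent bytes
        self.decode = decode  # hypothetical stand-in for GPU reconstruction
        self.images = OrderedDict()   # key -> decoded pixels, in LRU order
        self.latents = OrderedDict()  # key -> compressed latent, in LRU order

    @staticmethod
    def _used(tier):
        return sum(len(value) for value in tier.values())

    def _shrink(self):
        image_budget = int(self.capacity * self.image_fraction)
        latent_budget = self.capacity - image_budget
        # Demote cold images: drop the pixels, keep the much smaller latent.
        while self.images and self._used(self.images) > image_budget:
            key, _ = self.images.popitem(last=False)
            self.latents[key] = self.store[key]
        # Evict cold latents; they still exist in the durable store.
        while self.latents and self._used(self.latents) > latent_budget:
            self.latents.popitem(last=False)

    def get(self, key):
        if key in self.images:  # fast path: pixels already decoded
            self.images.move_to_end(key)
            return self.images[key], "image_hit"
        if key in self.latents:  # cached latent: pay decode cost, skip storage read
            latent = self.latents.pop(key)
            outcome = "latent_hit"
        else:  # full miss: read the latent from durable storage
            latent = self.store[key]
            outcome = "miss"
        pixels = self.decode(latent)
        self.images[key] = pixels  # promote to the image tier
        self._shrink()
        return pixels, outcome

    def set_split(self, image_fraction):
        """Move the boundary between tiers (assumed here to be set externally)."""
        self.image_fraction = image_fraction
        self._shrink()
```

In this sketch, lowering `image_fraction` demotes the coldest decoded images to latents, so more distinct objects fit in the same memory at the cost of a decode on their next access. That is the trade-off the abstract says LatentBox tunes continuously.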

Problem

Research questions and friction points this paper is trying to address.

AI-generated images

storage efficiency

latent representation

object storage

sustainability

Innovation

Methods, ideas, or system contributions that make the work stand out.

latent-first storage

AI-generated images

on-demand reconstruction