Towards Defining an Efficient and Expandable File Format for AI-Generated Contents

📅 2024-10-13

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the low efficiency of pixel-level compression and the poor cross-model and cross-platform compatibility in storing and transmitting AI-generated content (AIGC) images, this paper proposes AIGIF, a novel file format. AIGIF abandons conventional pixel-based compression and instead encodes the generation syntax, including text prompts, model information, and sampling configurations, as structured metadata. It introduces a composable bitstream structure and an extensible metadata framework that jointly represent platform, model, and data configuration information. Experimental results show that AIGIF achieves compression ratios of up to 1/10,000 while preserving high-fidelity image reconstruction. The format is also designed to work across different generative models and heterogeneous platforms, and its expandable syntax is intended to accommodate future generators.

📝 Abstract

Recently, AI-generated content (AIGC) has gained significant traction due to its powerful creation capability. However, storing and transmitting large volumes of high-quality AIGC images poses new challenges for existing file formats. To overcome this, we define a new file format for AIGC images, named AIGIF, that enables ultra-low-bitrate coding of AIGC images. Unlike existing file formats, which compress AIGC images in pixel space, AIGIF compresses the generation syntax. This raises a crucial question: which generation syntax elements (e.g., text prompt, device configuration) are necessary for compression and transmission? To answer this question, we systematically investigate the effects of three essential factors: platform, generative model, and data configuration. We find experimentally that a well-designed composable bitstream structure incorporating these three factors can achieve a compression ratio of up to 1/10,000 while still ensuring high fidelity. We also introduce an expandable syntax in AIGIF to support generation models developed in the future.
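
To give a feel for the scale of the 1/10,000 figure, the sketch below serializes a small generation-syntax record and compares its size with the raw pixel data of a 1024×1024 RGB image. It is only an illustration: the field names, the example values, and the use of JSON plus zlib are assumptions made here, not the actual AIGIF bitstream. Decoding recovers the syntax, not the pixels; reconstructing the image would still require rerunning the generator under the recorded platform, model, and data configuration.

```python
import json
import zlib


def encode_generation_syntax(syntax: dict) -> bytes:
    """Serialize the syntax deterministically, then deflate it."""
    payload = json.dumps(syntax, sort_keys=True, separators=(",", ":"))
    return zlib.compress(payload.encode("utf-8"), 9)


def decode_generation_syntax(blob: bytes) -> dict:
    """Inverse of encode_generation_syntax."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical record grouped by the three factors named in the abstract.
# Every key and value here is a placeholder, not taken from the paper.
syntax = {
    "platform": {"device": "gpu", "precision": "fp16"},
    "model": {"name": "example-diffusion-model", "version": "1.0"},
    "data": {
        "prompt": "a red fox in the snow",
        "seed": 1234,
        "steps": 30,
        "width": 1024,
        "height": 1024,
    },
}

blob = encode_generation_syntax(syntax)
raw_bytes = 1024 * 1024 * 3  # uncompressed 8-bit RGB pixels
ratio = len(blob) / raw_bytes

print(f"syntax: {len(blob)} bytes, raw pixels: {raw_bytes} bytes")
print(f"size ratio: about 1/{round(1 / ratio)}")
```

A record of a few hundred bytes against roughly 3 MB of raw pixels is what makes ratios of this order plausible; the exact figure depends on the prompt length and on how the syntax is encoded.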

Problem

Research questions and friction points this paper is trying to address.

Defining an efficient file format for AIGC images

Compressing AI-generated images

Investigating generation syntax elements

Innovation

Methods, ideas, or system contributions that make the work stand out.

AIGIF file format

Compression of generation syntax

Composable bitstream structure
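
One common way to make a bitstream both composable and expandable is a type-length-value (TLV) layout, in which a decoder skips segment types it does not recognize. The sketch below illustrates that idea under stated assumptions: the segment IDs, the 1-byte type and 2-byte length header, and the function names are hypothetical and are not the layout defined by AIGIF.

```python
import struct

# Hypothetical segment type IDs (assumptions, not from the paper).
SEG_PLATFORM, SEG_MODEL, SEG_DATA = 1, 2, 3
HEADER = ">BH"  # 1-byte type, 2-byte big-endian payload length
HEADER_SIZE = struct.calcsize(HEADER)


def pack_segments(segments):
    """Concatenate (type, payload) pairs into one TLV bitstream."""
    out = bytearray()
    for seg_type, payload in segments:
        out += struct.pack(HEADER, seg_type, len(payload))
        out += payload
    return bytes(out)


def unpack_segments(blob, known_types):
    """Parse a TLV bitstream, skipping segment types the decoder does not know."""
    parsed = {}
    offset = 0
    while offset < len(blob):
        seg_type, length = struct.unpack_from(HEADER, blob, offset)
        offset += HEADER_SIZE
        payload = blob[offset:offset + length]
        offset += length
        if seg_type in known_types:
            parsed[seg_type] = payload
    return parsed


stream = pack_segments([
    (SEG_PLATFORM, b"gpu/fp16"),
    (SEG_MODEL, b"example-diffusion-model"),
    (SEG_DATA, b"a red fox in the snow"),
    (99, b"segment added by a future encoder"),  # unknown to this decoder
])

decoded = unpack_segments(stream, {SEG_PLATFORM, SEG_MODEL, SEG_DATA})
print(sorted(decoded))
```

Because each segment carries its own length, an older decoder can step over segment type 99 without understanding it, which is the kind of forward compatibility an expandable syntax aims for.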
