Towards Defining an Efficient and Expandable File Format for AI-Generated Contents

📅 2024-10-13
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the low efficiency of pixel-level compression and the poor cross-model and cross-platform compatibility in storing and transmitting AI-generated content (AIGC) images, this paper proposes AIGIF, a new file format. Rather than compressing pixels, AIGIF encodes the generation syntax (for example, text prompts, model information, and sampling or device configurations) as structured metadata. It introduces a composable bitstream structure that jointly represents platform, generative model, and data configuration information. Experiments show that AIGIF reaches compression ratios of up to 1/10,000 (roughly a 10,000× size reduction) while preserving high fidelity. The design targets interoperability across generative models and platforms, and an expandable syntax lets the format accommodate generation models developed in the future.

📝 Abstract
Recently, AI-generated content (AIGC) has gained significant traction due to its powerful creation capability. However, the storage and transmission of large amounts of high-quality AIGC images inevitably pose new challenges for existing file formats. To overcome this, we define a new file format for AIGC images, named AIGIF, enabling ultra-low bitrate coding of AIGC images. Unlike existing file formats, which compress AIGC images in pixel space, AIGIF instead compresses the generation syntax. This raises a crucial question: which generation syntax elements (e.g., text prompt, device configuration, etc.) are necessary for compression and transmission? To answer this question, we systematically investigate the effects of three essential factors: platform, generative model, and data configuration. We experimentally find that a well-designed composable bitstream structure incorporating these three factors can achieve a compression ratio of up to 1/10,000 while still ensuring high fidelity. We also introduce an expandable syntax in AIGIF so that the format can be extended to more advanced generation models developed in the future.
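
To give the 1/10,000 figure a sense of scale, the short calculation below works out the byte budget it would imply for a single image. It is only an illustration: the abstract does not say what the ratio is measured against, so the uncompressed 8-bit RGB baseline, the 1024×1024 resolution, and both function names are assumptions made for this sketch.

```python
def raw_rgb_bytes(width, height, channels=3, bits_per_channel=8):
    """Size in bytes of an uncompressed image (assumed baseline, not from the paper)."""
    return width * height * channels * bits_per_channel // 8


def byte_budget(raw_bytes, reduction_factor=10_000):
    """Whole bytes available if the stored size is raw_bytes / reduction_factor."""
    return raw_bytes // reduction_factor


raw = raw_rgb_bytes(1024, 1024)   # 3 bytes per pixel, about 3 MiB in total
budget = byte_budget(raw)         # a few hundred bytes
print(raw, budget)
```

Under these assumptions the budget is a few hundred bytes, which is about the size of a text prompt plus a handful of numeric settings. That is consistent with the idea of storing generation syntax rather than pixels.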

Problem

Research questions and friction points this paper is trying to address.

Defining an efficient file format for AIGC images
Compressing AI-generated images at ultra-low bitrates
Identifying which generation syntax elements must be stored or transmitted

Innovation

Methods, ideas, or system contributions that make the work stand out.

AIGIF file format
Compression of generation syntax
Composable bitstream structure
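
The idea of a composable, expandable bitstream can be sketched as a sequence of tagged sections, one per factor (platform, generative model, data configuration). This is not the AIGIF syntax, which this page does not specify: the magic bytes, tag numbers, record layout, JSON payloads, and all field names below are hypothetical and chosen only to show how sections can be omitted or added without breaking a decoder.

```python
import json
import struct

# Hypothetical section tags and magic bytes, invented for this sketch.
TAG_PLATFORM, TAG_MODEL, TAG_DATA = 1, 2, 3
MAGIC = b"AIGS"


def pack_sections(sections):
    """Serialize {tag: dict} as MAGIC followed by [tag:1B][length:2B][payload] records."""
    out = bytearray(MAGIC)
    for tag, fields in sorted(sections.items()):
        payload = json.dumps(fields, separators=(",", ":"), sort_keys=True).encode("utf-8")
        out += struct.pack(">BH", tag, len(payload))  # big-endian, no padding
        out += payload
    return bytes(out)


def unpack_sections(blob, known_tags=(TAG_PLATFORM, TAG_MODEL, TAG_DATA)):
    """Parse records, skipping any section whose tag this decoder does not know."""
    if blob[:4] != MAGIC:
        raise ValueError("not a sketch bitstream")
    pos, sections = 4, {}
    while pos < len(blob):
        tag, length = struct.unpack_from(">BH", blob, pos)
        pos += 3
        if tag in known_tags:
            sections[tag] = json.loads(blob[pos:pos + length].decode("utf-8"))
        pos += length  # unknown sections are stepped over using their length
    return sections


# Composable: a stream may carry all three sections or only some of them.
full = pack_sections({
    TAG_PLATFORM: {"device": "example-gpu"},
    TAG_MODEL: {"name": "example-diffusion", "steps": 30},
    TAG_DATA: {"prompt": "a red fox in snow", "seed": 42},
})
minimal = pack_sections({TAG_DATA: {"prompt": "a red fox in snow", "seed": 42}})

# Expandable: a future section (tag 9 here) is ignored by an older decoder.
extended = pack_sections({TAG_DATA: {"seed": 42}, 9: {"future_option": True}})
print(len(full), len(minimal), unpack_sections(extended))
```

Length-prefixed records are one common way to get both properties at once: a writer includes only the sections a given receiver needs, and a reader can skip sections introduced later because each record declares its own size.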

Authors

Yixin Gao, University of Science and Technology of China (Computer Vision, Learned image compression)
Runsen Feng, University of Science and Technology of China (Data Compression)
Xin Li, University of Science and Technology of China
Weiping Li, University of Science and Technology of China
Zhibo Chen, University of Science and Technology of China