🤖 AI Summary
This paper introduces the novel task of “code-to-style image generation,” aiming to synthesize high-fidelity, visually coherent, and controllable stylized images from a single numeric code—without relying on textual prompts, reference images, or model fine-tuning. To this end, the authors propose CoTyle, the first open-source framework for numeric style control: it employs a discrete style codebook and an autoregressive style generator to map numeric codes into structured, semantically grounded style representations; these are then injected as conditional signals into a text-to-image diffusion model via a dedicated style-conditioning mechanism. Trained on large-scale stylistic data, the codebook ensures both high style diversity and strong reproducibility. Extensive experiments demonstrate that CoTyle achieves state-of-the-art performance in style consistency, controllability, and creative expressiveness under a “one-code-one-style” paradigm, enabling concise, reproducible, and high-precision numerical control over visual style in diffusion-based image synthesis.
📝 Abstract
Innovative visual stylization is a cornerstone of artistic creation, yet generating novel and consistent visual styles remains a significant challenge. Existing generative approaches typically rely on lengthy textual prompts, reference images, or parameter-efficient fine-tuning to guide style-aware image generation, but often struggle with style consistency, limited creativity, and complex style representations. In this paper, we affirm that a style is worth one numerical code by introducing the novel task, code-to-style image generation, which produces images with novel, consistent visual styles conditioned solely on a numerical style code. To date, this task has been explored primarily by industry (e.g., Midjourney), with no open-source research from the academic community. To fill this gap, we propose CoTyle, the first open-source method for this task. Specifically, we first train a discrete style codebook from a collection of images to extract style embeddings. These embeddings serve as conditions for a text-to-image diffusion model (T2I-DM) to generate stylistic images. Subsequently, we train an autoregressive style generator on the discrete style embeddings to model their distribution, allowing the synthesis of novel style embeddings. During inference, a numerical style code is mapped to a unique style embedding by the style generator, and this embedding guides the T2I-DM to generate images in the corresponding style. Unlike existing methods, ours offers unparalleled simplicity and diversity, unlocking a vast space of reproducible styles from minimal input. Extensive experiments validate that CoTyle effectively turns a numerical code into a style controller, demonstrating that a style is worth one code.
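The inference pipeline described above (numeric code → unique style embedding → conditioned generation) can be sketched in miniature. The snippet below is a toy illustration, not CoTyle's implementation: `style_code_to_embedding` stands in for the autoregressive style generator (here, the code simply seeds a deterministic sampler), and `generate_stylized_image` is a hypothetical stub for the conditioned T2I-DM call. All names, dimensions, and the seeding scheme are assumptions chosen to show the "one code, one style" reproducibility property.

```python
import numpy as np

STYLE_DIM = 8  # toy per-token embedding size; the real codebook entries are learned
SEQ_LEN = 4    # toy number of autoregressive sampling steps

def style_code_to_embedding(code: int) -> np.ndarray:
    """Toy stand-in for the autoregressive style generator.

    The numeric style code seeds the sampling process, so the same
    code always reproduces the same style embedding, while different
    codes yield different embeddings (and hence different styles).
    """
    rng = np.random.default_rng(code)  # the code acts as the sampling seed
    tokens = [rng.standard_normal(STYLE_DIM) for _ in range(SEQ_LEN)]
    return np.concatenate(tokens)

def generate_stylized_image(prompt: str, style_code: int) -> dict:
    """Hypothetical stub for the conditioned T2I-DM: in CoTyle, the style
    embedding is injected as a conditioning signal alongside the prompt."""
    style_emb = style_code_to_embedding(style_code)
    return {"prompt": prompt, "style_embedding": style_emb}

# Same code -> identical style across prompts; different codes -> different styles.
a = generate_stylized_image("a cat", style_code=42)
b = generate_stylized_image("a dog", style_code=42)
c = generate_stylized_image("a cat", style_code=7)
assert np.allclose(a["style_embedding"], b["style_embedding"])
assert not np.allclose(a["style_embedding"], c["style_embedding"])
```

The design point the sketch captures is that the style code never needs to describe the style; it only needs to deterministically index into the space of styles the generator has learned, which is what makes styles reproducible and shareable from a single number.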