CoLoGen: Progressive Learning of Concept-Localization Duality for Unified Image Generation

📅 2026-02-25
🤖 AI Summary
Unified conditional image generation struggles to reconcile the conflicting requirements of heterogeneous representations, such as conceptual understanding and spatial localization, within a single shared latent space. To address this, the work proposes CoLoGen, a framework built on concept-localization duality modeling and a Progressive Representation Weaving (PRW) mechanism. Through a staged curriculum, CoLoGen dynamically routes features to specialized expert modules so that both capabilities can be learned together. Built on a diffusion model, the method achieves competitive or superior performance on image editing, controllable generation, and customized generation.

📝 Abstract
Unified conditional image generation remains difficult because different tasks depend on fundamentally different internal representations. Some require conceptual understanding for semantic synthesis, while others rely on localization cues for spatial precision. Forcing these heterogeneous tasks to share a single representation leads to concept-localization representational conflict. To address this issue, we propose CoLoGen, a unified diffusion framework that progressively learns and reconciles this concept-localization duality. CoLoGen uses a staged curriculum that first builds core conceptual and localization abilities, then adapts them to diverse visual conditions, and finally refines their synergy for complex instruction-driven tasks. Central to this process is the Progressive Representation Weaving (PRW) module, which dynamically routes features to specialized experts and stably integrates their outputs across stages. Experiments on editing, controllable generation, and customized generation show that CoLoGen achieves competitive or superior performance, offering a principled representational perspective for unified image generation.
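
The abstract describes PRW as routing features to specialized experts and integrating their outputs, but gives no implementation details. The following is only a toy sketch of generic soft routing between two experts, not the paper's module: the names `route`, `gate_weights`, `concept_expert`, and `localization_expert` are hypothetical, and the softmax gate is an assumed design chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def route(feature, gate_weights, experts):
    """Soft-route one feature vector to every expert and blend the outputs.

    gate_weights: one weight vector per expert (hypothetical router parameters).
    experts: one callable per expert, each mapping a vector to a vector.
    """
    # One gate score per expert, normalized so the gates sum to 1.
    gates = softmax([dot(w, feature) for w in gate_weights])
    outputs = [expert(feature) for expert in experts]
    # Gate-weighted sum of the expert outputs, computed per dimension.
    blended = [
        sum(g * out[i] for g, out in zip(gates, outputs))
        for i in range(len(outputs[0]))
    ]
    return gates, blended


# Toy stand-ins (assumptions) for a "concept" and a "localization" expert.
def concept_expert(v):
    return [2.0 * x for x in v]


def localization_expert(v):
    return [x + 1.0 for x in v]


gates, blended = route(
    feature=[1.0, 0.0],
    gate_weights=[[1.0, 0.0], [0.0, 1.0]],
    experts=[concept_expert, localization_expert],
)
```

With these toy weights the first gate is larger than the second, so the blended output leans toward the stand-in concept expert. A real router would learn its gate parameters, and the staged curriculum and cross-stage integration described in the abstract are not modeled here.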
Problem

Research questions and friction points this paper is trying to address.

unified image generation
concept-localization duality
representational conflict
conditional image generation
diffusion models
Innovation

Methods, ideas, or system contributions that make the work stand out.

concept-localization duality
progressive representation weaving
unified image generation
diffusion framework
staged curriculum learning