Twin Co-Adaptive Dialogue for Progressive Image Generation

2025-04-21
AI Summary
Text-to-image generation often suffers from output misalignment due to ambiguous user prompts. To address this, we propose a dual-agent synchronous co-adaptive dialogue framework that models image generation as a dynamic, iterative human-AI collaborative optimization process: one agent performs conditional editing and latent-space feedback-driven fine-tuning, while the other models multi-turn dialogue semantics to resolve prompt ambiguity. Our approach is the first to enable joint co-adaptation of the generator and dialogue policy across both latent and semantic spaces, without requiring additional annotations or task-specific pretraining. Experiments demonstrate that our method significantly reduces user trial-and-error iterations (by 42% on average), improves intent alignment and visual fidelity, and achieves state-of-the-art performance on multiple human-AI collaborative image generation benchmarks.

Abstract
Modern text-to-image generation systems have enabled the creation of remarkably realistic and high-quality visuals, yet they often falter when handling the inherent ambiguities in user prompts. In this work, we present Twin-Co, a framework that leverages synchronized, co-adaptive dialogue to progressively refine image generation. Instead of a static generation process, Twin-Co employs a dynamic, iterative workflow where an intelligent dialogue agent continuously interacts with the user. Initially, a base image is generated from the user's prompt. Then, through a series of synchronized dialogue exchanges, the system adapts and optimizes the image according to evolving user feedback. The co-adaptive process allows the system to progressively narrow down ambiguities and better align with user intent. Experiments demonstrate that Twin-Co not only enhances user experience by reducing trial-and-error iterations but also improves the quality of the generated images, streamlining the creative process across various applications.
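
The iterative workflow described in the abstract (generate a base image, then refine it over successive dialogue rounds) can be illustrated with a minimal sketch. This is not the Twin-Co implementation: `generate_image`, `ask_clarifying_question`, `refine`, and `co_adaptive_loop` are hypothetical names, the "image" is a placeholder string, and folding feedback into the prompt is an assumed stand-in for whatever adaptation the real system performs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Session:
    """Tracks the evolving prompt and the feedback collected so far."""
    prompt: str
    history: List[str] = field(default_factory=list)


def generate_image(prompt: str) -> str:
    # Hypothetical stand-in for a text-to-image model: returns a
    # placeholder string instead of pixels.
    return f"<image for: {prompt}>"


def ask_clarifying_question(session: Session) -> str:
    # Hypothetical dialogue agent: a real one would target the specific
    # ambiguity that remains in the prompt.
    return f"Round {len(session.history) + 1}: what should change?"


def refine(prompt: str, feedback: str) -> str:
    # Assumed refinement rule: append the feedback to the prompt.
    return f"{prompt}, {feedback}"


def co_adaptive_loop(
    prompt: str,
    get_feedback: Callable[[str, str], str],
    max_rounds: int = 5,
) -> Tuple[str, Session]:
    session = Session(prompt)
    image = generate_image(session.prompt)  # base image from the raw prompt
    for _ in range(max_rounds):
        question = ask_clarifying_question(session)
        feedback = get_feedback(question, image)
        if not feedback:  # empty feedback means the user is satisfied
            break
        session.history.append(feedback)
        session.prompt = refine(session.prompt, feedback)
        image = generate_image(session.prompt)  # regenerate after each round
    return image, session
```

With scripted feedback such as `"red collar"`, then `"sitting on grass"`, then an empty reply, the loop stops after two refinements and the final prompt contains both pieces of feedback. The `max_rounds` cap is an illustrative safeguard, not something the source specifies.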
Problem

Research questions and friction points this paper is trying to address.

Handling ambiguities in user prompts for image generation
Progressive refinement of images through synchronized dialogue
Reducing trial-and-error iterations to align with user intent
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synchronized co-adaptive dialogue refines images
Dynamic iterative workflow with user feedback
Progressive ambiguity reduction aligns output with user intent