🤖 AI Summary
Existing concept-slider methods exhibit low editing accuracy and poor photorealism when applied to authentic, non-AIGC (i.e., real-world, camera-captured) images. To address this, we propose an adversarial image editing framework that synergistically integrates GANs and diffusion models, introducing a novel dual-guidance mechanism—combining text semantics and fine-grained visual features—to enable cross-category, high-fidelity, pixel-level manipulation. Unlike conventional slider-based approaches reliant on synthetic data, our method operates robustly on real-scene images, significantly improving both geometric precision and perceptual plausibility. Extensive experiments demonstrate superior performance across diverse editing tasks—including attribute modulation and localized inpainting—with higher PSNR, lower LPIPS, and better user preference scores than current state-of-the-art methods. This work establishes a new paradigm for controllable, high-quality editing of real-world photographs.
📝 Abstract
In the realm of image generation, the quest for realism and customization has never been more pressing. While existing methods like concept sliders have made strides, they often falter when it comes to no-AIGC images, particularly images captured in real world settings. To bridge this gap, we introduce Beyond Sliders, an innovative framework that integrates GANs and diffusion models to facilitate sophisticated image manipulation across diverse image categories. Improved upon concept sliders, our method refines the image through fine grained guidance both textual and visual in an adversarial manner, leading to a marked enhancement in image quality and realism. Extensive experimental validation confirms the robustness and versatility of Beyond Sliders across a spectrum of applications.