Efficient High-Resolution Image Editing with Hallucination-Aware Loss and Adaptive Tiling

📅 2025-10-07
🤖 AI Summary
To deploy high-resolution (4K) diffusion-based image editing efficiently on resource-constrained devices, particularly mobile platforms, this paper proposes MobilePicasso, a three-stage editing system: (1) editing at a standard resolution with a hallucination-aware loss; (2) latent projection that avoids a round trip through pixel space; and (3) upscaling the edited image latent to a higher resolution with adaptive context-preserving tiling. In a user study with 46 participants, the approach improves image quality by 18–48% and reduces hallucinations by 14–51% over existing methods. It achieves up to a 55.8× speed-up over prior work at the cost of about 9% more runtime memory, and its on-device runtime is reported to be faster than a server-based high-resolution image editing model running on an A100 GPU.

📝 Abstract
High-resolution (4K) image-to-image synthesis has become increasingly important for mobile applications. Existing diffusion models for image editing face significant challenges, in terms of memory and image quality, when deployed on resource-constrained devices. In this paper, we present MobilePicasso, a novel system that enables efficient image editing at high resolutions while minimising computational cost and memory usage. MobilePicasso comprises three stages: (i) performing image editing at a standard resolution with hallucination-aware loss, (ii) applying latent projection to avoid going to the pixel space, and (iii) upscaling the edited image latent to a higher resolution with adaptive context-preserving tiling. Our user study with 46 participants reveals that MobilePicasso not only improves image quality by 18-48% but also reduces hallucinations by 14-51% over existing methods. MobilePicasso demonstrates significantly lower latency, e.g., up to 55.8× speed-up, with only a small increase in runtime memory, e.g., a mere 9% increase over prior work. Surprisingly, the on-device runtime of MobilePicasso is observed to be faster than a server-based high-resolution image editing model running on an A100 GPU.
Problem

Research questions and friction points this paper is trying to address.

Enables efficient high-resolution image editing on mobile devices
Reduces computational cost and memory usage for image synthesis
Minimizes hallucinations and improves quality in 4K image editing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hallucination-aware loss improves image editing quality
Latent projection avoids pixel space conversion overhead
Adaptive tiling enables efficient high-resolution upscaling
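
The general idea behind context-preserving tiling can be illustrated with a minimal sketch. This is not MobilePicasso's implementation: the page does not describe the tiling rule, so the fixed tile size, the fixed context margin, the nearest-neighbour stand-in upscaler, and the names `nearest_upscale` and `upscale_tiled` are all assumptions made for illustration. Each tile is upscaled together with a border of neighbouring pixels, and only the tile's core is written to the output, so a single upscaler call never sees more than one padded tile.

```python
import numpy as np

def nearest_upscale(x, scale):
    """Stand-in upscaler (assumption): nearest-neighbour repeat along H and W."""
    return np.repeat(np.repeat(x, scale, axis=0), scale, axis=1)

def upscale_tiled(image, upscale_fn=nearest_upscale, tile=32, context=8, scale=2):
    """Upscale an (H, W, C) array tile by tile, giving each tile a context margin.

    Hypothetical helper: tile size and margin are fixed here, whereas an
    adaptive scheme would choose them per image or per region.
    """
    h, w = image.shape[:2]
    out = np.zeros((h * scale, w * scale) + image.shape[2:], dtype=image.dtype)
    for y0 in range(0, h, tile):
        for x0 in range(0, w, tile):
            y1, x1 = min(y0 + tile, h), min(x0 + tile, w)
            # Extend the tile by `context` pixels, clamped to the image bounds.
            py0, px0 = max(y0 - context, 0), max(x0 - context, 0)
            py1, px1 = min(y1 + context, h), min(x1 + context, w)
            up = upscale_fn(image[py0:py1, px0:px1], scale)
            # Crop the upscaled margin away and keep only the tile core.
            cy, cx = (y0 - py0) * scale, (x0 - px0) * scale
            out[y0 * scale:y1 * scale, x0 * scale:x1 * scale] = up[
                cy:cy + (y1 - y0) * scale, cx:cx + (x1 - x0) * scale
            ]
    return out

rng = np.random.default_rng(0)
img = rng.random((70, 50, 3)).astype(np.float32)
tiled = upscale_tiled(img)
full = nearest_upscale(img, 2)
```

With the nearest-neighbour stand-in, the tiled result matches upscaling the whole array at once, which makes the bookkeeping easy to check. With a learned upscaler the margin would matter more, since it supplies the surrounding context that otherwise goes missing at tile borders.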
Young D. Kwon
Samsung AI Center-Cambridge; University of Cambridge
Machine Learning Systems, Generative AI, Continual Learning, Meta-Learning, Mobile Systems
Abhinav Mehrotra
Samsung AI Center
GenAI, AutoML, SSL, Machine Learning
Malcolm Chadwick
Samsung AI Center-Cambridge
Alberto Gil Ramos
Samsung AI Center-Cambridge
Sourav Bhattacharya
Samsung AI Center-Cambridge