Generate "Normal", Edit Poisoned: Branding Injection via Hint Embedding in Image Editing

📅 2026-05-11

🤖 AI Summary
This work uncovers a stealthy attack in which adversaries embed near-imperceptible brand cues, such as logos, into input images so that multi-turn image generation and editing models reproduce the embedded content on semantically related objects, even when the user's prompt never mentions it. The attack is demonstrated in both phishing-style and data-poisoning scenarios, with average success rates of 44.4% and 32.2%, respectively. The authors introduce and validate the first hint-embedding-based method for covert brand injection, integrating diffusion models, image editing, and invisible watermarking techniques. They also propose a mitigation strategy that achieves average success rates of 87.4% and 92.3% against the phishing-based and poison-based attacks, respectively.
📝 Abstract
With the rapid advancement of generative AI, users increasingly rely on image-generation models for image design and creation. To achieve faithful outputs, users typically engage in multi-turn interactions during image refinement: a text-to-image generation phase followed by a text-guided image-to-image editing phase. In this paper, we investigate a novel security vulnerability associated with such a workflow. Our key insight is that a nearly invisible hint, like branding information (e.g., a logo), embedded in an input image can be recognized by downstream generative models and subsequently re-rendered onto semantically related objects, even when the user prompt does not explicitly mention it. This form of hidden payload injection makes the attack stealthy. We study two realistic attack scenarios. The first is a phishing-based setting, in which an attacker controls an online image generation service and injects hidden content into generated images before they are returned to users. The second is a poison-based setting, where an attacker distributes a compromised text-to-image diffusion model whose output contains hidden content. We evaluate both attacks using six injected payloads, including well-known logos and customized designs, and demonstrate that the two attacks can achieve success rates of 44.4% and 32.2% on average, respectively, while ensuring the injected logos are visually imperceptible. We also develop a mitigation solution that achieves an average success rate of 87.4% and 92.3% against the phishing-based and poison-based attacks, respectively.
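The abstract reports success rates "on average" over six injected payloads but does not spell out how the average is formed. As a minimal sketch, assuming each payload's rate is the fraction of trials judged successful and that the payloads are weighted equally (a macro-average), the computation could look like the following. The function names, payload labels, and trial outcomes are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def success_rate(outcomes):
    """Fraction of trials marked successful (True) for one payload."""
    if not outcomes:
        raise ValueError("no trials recorded")
    return sum(outcomes) / len(outcomes)


def average_success_rate(trials_by_payload):
    """Macro-average: mean of per-payload rates, each payload weighted equally.

    This weighting is an assumption; the abstract does not state it.
    """
    return mean(success_rate(o) for o in trials_by_payload.values())


# Hypothetical outcomes for illustration only, NOT data from the paper.
trials = {
    "payload_a": [True, False, False, True],   # 2 of 4 trials succeed
    "payload_b": [False, False, True, False],  # 1 of 4 trials succeed
}

print(f"{average_success_rate(trials):.1%}")  # prints 37.5%
```

A micro-average (pooling all trials before dividing) would give the same number here only because both payloads have the same trial count; with unequal counts the two conventions diverge.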
Problem

Research questions and friction points this paper is trying to address.

image editing
branding injection
hint embedding
generative AI
security vulnerability
Innovation

Methods, ideas, or system contributions that make the work stand out.

hint embedding
branding injection
image editing attack
diffusion model poisoning
stealthy payload
Desen Sun
University of Waterloo
Jason Hon
University of Waterloo
Howe Wang
University of Waterloo
Saarth Rajan
University of Waterloo
Meng Xu
University of Waterloo
Sihang Liu
Assistant Professor of School of Computer Science, University of Waterloo
Computer Systems, Computer Architecture, Sustainability