Toward Accountable AI-Generated Content on Social Platforms: Steganographic Attribution and Multimodal Harm Detection

πŸ“… 2026-04-12
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses the misuse of AI-generated images paired with harmful text, a pairing that often evades conventional moderation because synthetic images carry no traceable metadata. The authors propose an end-to-end forensic framework that embeds a cryptographically signed watermark during image generation and integrates multimodal harmful-content detection as a trigger for provenance verification. By co-designing steganography with multimodal harm detection, the approach establishes a cross-modal, triggerable accountability mechanism for AI-generated content. Experiments show that the adopted wavelet-domain spread-spectrum watermark is strongly robust to blurring perturbations, while the CLIP-based multimodal detector achieves an AUC-ROC of 0.99, substantially improving the reliability of content attribution.
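
The summary's "cryptographically signed watermark" implies a payload that pairs provenance identifiers with a digital signature the platform can later verify. Below is a minimal sketch of such a payload using Ed25519 from the `cryptography` package; the field names (`model_id`, `user_id`) and the bit-serialization step are illustrative assumptions, not the paper's actual format.

```python
# Illustrative sketch: build a signed provenance payload whose bits a
# watermark embedder could carry. Field names are hypothetical.
import json
import time

import numpy as np
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()  # held by the generation service

payload = json.dumps(
    {"model_id": "gen-v1", "user_id": "u123", "ts": int(time.time())},
    separators=(",", ":"),
).encode()
signature = signing_key.sign(payload)       # 64-byte Ed25519 signature

# Serialize payload + signature to a bit vector for the embedder.
bits = np.unpackbits(np.frombuffer(payload + signature, dtype=np.uint8))

# Platform-side check; raises cryptography.exceptions.InvalidSignature
# if either the payload or the signature was tampered with.
signing_key.public_key().verify(signature, payload)
```

Since only the payload and signature are embedded, anyone holding the public key can verify provenance, while forging a valid identifier requires the private signing key.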

πŸ“ Abstract
The rapid growth of generative AI has introduced new challenges in content moderation and digital forensics. In particular, benign AI-generated images can be paired with harmful or misleading text, creating misuse that is difficult to detect. This contextual misuse undermines traditional moderation frameworks and complicates attribution, as synthetic images typically lack persistent metadata or device signatures. We introduce a steganography-enabled attribution framework that embeds cryptographically signed identifiers into images at creation time and uses multimodal harmful-content detection as a trigger for attribution verification. Our system evaluates five watermarking methods across the spatial, frequency, and wavelet domains, and it integrates a CLIP-based fusion model for multimodal harmful-content detection. Experiments demonstrate that spread-spectrum watermarking, especially in the wavelet domain, provides strong robustness under blur distortions, and our multimodal fusion detector achieves an AUC-ROC of 0.99, enabling reliable cross-modal attribution verification. Together these components form an end-to-end forensic pipeline for tracing harmful deployments of AI-generated imagery, supporting accountability in modern synthetic media environments. Our code is available on GitHub: https://github.com/bli1/steganography
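
The abstract's headline robustness result concerns spread-spectrum embedding in the wavelet domain. The code below is a minimal sketch of the idea, assuming NumPy and PyWavelets, a single-level Haar DWT, and one detail band; it is an illustration of the technique, not the authors' implementation.

```python
# Sketch of wavelet-domain spread-spectrum watermarking: each payload bit
# is spread over a keyed pseudo-noise (PN) carrier added to DWT detail
# coefficients, and recovered blindly by correlating with the same carrier.
import numpy as np
import pywt

def embed(image: np.ndarray, bits, key: int = 0, alpha: float = 3.0):
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(np.float64), "haar")
    flat = cH.flatten()
    chips = flat.size // len(bits)            # carrier chips per bit
    rng = np.random.default_rng(key)          # `key` is the shared secret
    for i, b in enumerate(bits):
        pn = rng.choice([-1.0, 1.0], size=chips)
        flat[i * chips:(i + 1) * chips] += alpha * (1 if b else -1) * pn
    return pywt.idwt2((cA, (flat.reshape(cH.shape), cV, cD)), "haar")

def extract(image: np.ndarray, n_bits: int, key: int = 0):
    _, (cH, _, _) = pywt.dwt2(image.astype(np.float64), "haar")
    flat = cH.flatten()
    chips = flat.size // n_bits
    rng = np.random.default_rng(key)          # regenerate the same carriers
    return [int(flat[i * chips:(i + 1) * chips]
                @ rng.choice([-1.0, 1.0], size=chips) > 0)
            for i in range(n_bits)]
```

Spreading each bit over thousands of coefficients is what buys blur robustness: a low-pass filter attenuates individual coefficients, but the correlation detector averages over the whole carrier, so the sign of each bit tends to survive.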
Problem

Research questions and friction points this paper is trying to address.

AI-generated content
content moderation
attribution
multimodal harm detection
synthetic media
Innovation

Methods, ideas, or system contributions that make the work stand out.

steganographic attribution
multimodal harm detection
AI-generated content
digital watermarking
CLIP-based fusion
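
As a concrete reading of the "CLIP-based fusion" entry above, the detector described in the abstract can be approximated by concatenating CLIP image and text embeddings and training a small classification head on harmful/benign pairs. The sketch below, assuming PyTorch and Hugging Face transformers, is one plausible realization, not the paper's exact architecture.

```python
# Sketch of a CLIP-based late-fusion harmful-content detector: CLIP
# encoders feed a small MLP that scores the image-text pair jointly.
import torch
import torch.nn as nn
from transformers import CLIPModel, CLIPProcessor

class FusionDetector(nn.Module):
    def __init__(self, name: str = "openai/clip-vit-base-patch32"):
        super().__init__()
        self.clip = CLIPModel.from_pretrained(name)
        d = self.clip.config.projection_dim       # 512 for ViT-B/32
        self.head = nn.Sequential(                # fusion MLP -> harm logit
            nn.Linear(2 * d, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, pixel_values, input_ids, attention_mask):
        img = self.clip.get_image_features(pixel_values=pixel_values)
        txt = self.clip.get_text_features(input_ids=input_ids,
                                          attention_mask=attention_mask)
        return self.head(torch.cat([img, txt], dim=-1))

# Usage (given a PIL image `img` and its caption string `cap`):
#   proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
#   batch = proc(text=[cap], images=img, return_tensors="pt", padding=True)
#   p_harm = torch.sigmoid(FusionDetector()(batch["pixel_values"],
#                          batch["input_ids"], batch["attention_mask"]))
```

In the pipeline the paper describes, a harm score above threshold would trigger watermark extraction and signature verification on the flagged image.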
πŸ‘₯ Authors
Xinlei Guan
Kean University, Union, NJ 07083 USA
David Arosemena
Kean University, Union, NJ 07083 USA
Tejaswi Dhandu
North Dakota State University, Fargo, ND 58102 USA
Kuan Huang
Kean University, Union, NJ 07083 USA
Meng Xu
Assistant Professor of Computer Science, Kean University
Computer Vision, Deep Learning, Medical Image Analysis
Miles Q. Li
McGill University, MontrΓ©al, QC H3A 0G4, Canada
Bingyu Shen
Meta Platforms, Inc.
Software Reliability, Security, Software Engineering
Ruiyang Qin
Assistant Professor, Villanova University
Hardware/software co-design, deep learning acceleration, on-device AI, AI for healthcare
Umamaheswara Rao Tida
Assistant Professor, North Dakota State University
3D IC, design automation of VLSI systems, High-Performance Computing
Boyang Li
Kean University, Union, NJ 07083 USA