AI Summary
AI-generated content (AIGC) has created a trust crisis and new regulatory challenges, yet existing detection tools remain fragmented and lack support for visible compliance labeling. This paper introduces the first open-source, unified toolkit for multimodal AIGC governance. It pioneers a dual-strategy framework that integrates imperceptible watermarking (for copyright protection) with visible labeling (for regulatory compliance). The authors design a cross-modal unified abstraction engine and a standardized tri-modal evaluation benchmark: Image/Video/Audio-Bench. The toolkit incorporates multimodal feature alignment, lightweight neural watermark embedding, interpretable label rendering, and a modular inference architecture. It achieves state-of-the-art performance across all three benchmarks, delivering high accuracy, low latency, and verifiability. The toolkit is publicly released and has been deployed in real-world regulatory auditing scenarios.
Abstract
The rapid proliferation of Artificial Intelligence Generated Content (AIGC) has precipitated a crisis of trust and urgent regulatory demands. However, existing identification tools are fragmented and lack support for visible compliance marking. To address these gaps, we introduce UniMark, an open-source, unified framework for multimodal content governance. Our system features a modular unified engine that abstracts complexities across text, image, audio, and video modalities. Crucially, we propose a novel dual-operation strategy that natively supports both Hidden Watermarking for copyright protection and Visible Marking for regulatory compliance. Furthermore, we establish a standardized evaluation framework with three specialized benchmarks (Image/Video/Audio-Bench) to ensure rigorous performance assessment. This toolkit bridges the gap between advanced algorithms and engineering implementation, fostering a more transparent and secure digital ecosystem.