🤖 AI Summary
AIGC faces critical challenges in news image generation—including misinformation propagation, semantic distortion, and lack of explainability—due to the inherent opacity and limited controllability of prevailing “black-box” models. To address these issues in news feature reporting, this work proposes a controllable and trustworthy AIGC image generation framework. Methodologically, it introduces the first three-dimensional evaluation system tailored for news features (CIS/CEA/U-PA), designs a human-in-the-loop modular pipeline enabling semantic traceability, editable intervention, and content verifiability, and integrates SAM/GroundingDINO-based segmentation, BrushNet-enabled semantic alignment, Style-LoRA-driven stylistic control, Prompt-to-Prompt conditioning, CLIP-based semantic scoring, and multi-stage content filtering. Experimental results from real-world media deployments demonstrate a 37.2% improvement in semantic fidelity, cultural expression accuracy exceeding 91.5%, and end-to-end provenance tracing via content credentials.
📝 Abstract
Artificial Intelligence Generated Content (AIGC) assisting image production triggers controversy in journalism while attracting attention from media agencies. Key issues involve misinformation, authenticity, semantic fidelity, and interpretability. Most AIGC tools are opaque "black boxes," hindering the dual demands of content accuracy and semantic alignment and creating ethical, sociotechnical, and trust dilemmas. This paper explores pathways for controllable image production in journalism's special coverage and conducts two experiments with projects from China's media agency: (1) Experiment 1 tests cross-platform adaptability via standardized prompts across three scenes, revealing disparities in semantic alignment, cultural specificity, and visual realism driven by training-corpus bias and platform-level filtering. (2) Experiment 2 builds a human-in-the-loop modular pipeline combining high-precision segmentation (SAM, GroundingDINO), semantic alignment (BrushNet), and style regulating (Style-LoRA, Prompt-to-Prompt), ensuring editorial fidelity through CLIP-based semantic scoring, NSFW/OCR/YOLO filtering, and verifiable content credentials. Traceable deployment preserves semantic representation. Consequently, we propose a human-AI collaboration mechanism for AIGC assisted image production in special coverage and recommend evaluating Character Identity Stability (CIS), Cultural Expression Accuracy (CEA), and User-Public Appropriateness (U-PA).