Seeing the Poem: Image-Semantic Detection of AI-Generated Modern Chinese Poetry with MLLMs

📅 2026-05-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses two gaps: existing large language models perform poorly at detecting AI-generated modern Chinese poetry, and no multimodal approach incorporates visual semantics. The work proposes an image-semantic-guided, example-driven multimodal detection method that uses multimodal large language models, such as Gemini, to jointly model complementary information from text and images at the levels of meaning, imagery, and emotion. Experiments show that the proposed approach outperforms text-only baselines on poetry generated by multiple LLMs. Notably, the Gemini detector achieves a Macro-F1 score of 85.65%, surpassing the strong unimodal detector RoBERTa and reaching state-of-the-art performance on this task.
📝 Abstract
Previous detection studies have shown that LLMs cannot be used effectively as detectors, but none of them has addressed modern Chinese poetry, and no research has examined how well LLMs detect AI-generated modern Chinese poetry. This paper evaluates and enhances the performance of LLMs as detectors for modern Chinese poetry and proposes an image-semantic-guided poetry detection method. Unlike traditional detection approaches, our method incorporates images that reflect the content of the poem. Through an example-driven approach, it integrates information such as meaning, imagery, and feeling from the image and forms a judgment that complements the poem text. Experimental results demonstrate that LLM detectors based on our method outperform baseline detectors based on plain text and even surpass the best-performing traditional detector, RoBERTa. The Gemini detector using our method achieves a Macro-F1 score of 85.65%, reaching the state-of-the-art level. The performance improvements of different LLM detectors on data generated by multiple LLMs demonstrate the effectiveness of our method.
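For readers unfamiliar with the headline metric, the following is a minimal sketch of how a Macro-F1 score is computed for a two-class detector: F1 is calculated separately for each class and the per-class scores are averaged with equal weight. The label names `"human"` and `"ai"` and the toy predictions are assumptions made for illustration; they are not taken from the paper's data or evaluation code.

```python
def macro_f1(y_true, y_pred, labels=("human", "ai")):
    """Unweighted mean of per-class F1 scores.

    The label names are illustrative assumptions, not the paper's.
    """
    per_class_f1 = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        # Guard against empty denominators when a class is never
        # predicted or never present.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        per_class_f1.append(f1)

    return sum(per_class_f1) / len(per_class_f1)


# Toy example (hypothetical labels): one human-written poem is
# misclassified as AI-generated.
y_true = ["human", "human", "ai", "ai"]
y_pred = ["human", "ai", "ai", "ai"]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally to the average, a detector cannot score well on Macro-F1 by favoring only the majority class.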
Problem

Research questions and friction points this paper is trying to address.

AI-generated poetry
modern Chinese poetry
LLM detection
image-semantic detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

image-semantic guidance
multimodal detection
AI-generated poetry
large language models
modern Chinese poetry