CCiV: A Benchmark for Structure, Rhythm and Quality in LLM-Generated Chinese Ci Poetry

📅 2026-02-15
🤖 AI Summary
This study addresses the challenge large language models face in simultaneously satisfying metrical constraints, tonal patterns, and literary quality when generating classical Chinese ci poetry. The work proposes CCiV, the first fine-grained evaluation benchmark tailored to variant forms of Chinese ci lyrics, comprising a test set derived from 30 canonical ci patterns. Seventeen large language models are comprehensively assessed across three dimensions: structural compliance, tonal adherence, and artistic merit. Findings reveal that while models generally conform well to structural rules, they exhibit significant deficiencies in tonal control. Moreover, metrical correctness shows no consistent correlation with literary quality. The study further demonstrates that form-aware prompting enhances metrical performance in stronger models but may degrade results for weaker ones. This work establishes a systematic evaluation framework and offers nuanced insights into the generation of classical Chinese poetry.

📝 Abstract
The generation of classical Chinese Ci poetry, a form demanding a sophisticated blend of structural rigidity, rhythmic harmony, and artistic quality, poses a significant challenge for large language models (LLMs). To systematically evaluate and advance this capability, we introduce Chinese Cipai Variants (CCiV), a benchmark designed to assess LLM-generated Ci poetry across these three dimensions: structure, rhythm, and quality. Our evaluation of 17 LLMs on 30 Cipai reveals two critical phenomena: models frequently generate valid but unexpected historical variants of a poetic form, and adherence to tonal patterns is substantially harder than adherence to structural rules. We further show that form-aware prompting can improve structural and tonal control for stronger models, while potentially degrading weaker ones. Finally, we observe weak and inconsistent alignment between formal correctness and literary quality in our sample. CCiV highlights the need for variant-aware evaluation and more holistic constrained creative generation methods.
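To make the idea of variant-aware evaluation concrete, the following is a minimal Python sketch, not the benchmark's actual implementation. It assumes a hypothetical `VARIANTS` table of per-line character counts and hypothetical tone strings (`P` for level, `Z` for oblique, `X` for unconstrained); the names `best_variant` and `tonal_adherence`, the scoring formulas, and the placeholder template numbers are all illustrative assumptions.

```python
import re

# Hypothetical variant table: each variant lists the expected character
# count of every line. The numbers are placeholders, not real Cipai data.
VARIANTS = {
    "demo_cipai": {
        "variant_a": [3, 3, 7],
        "variant_b": [3, 3, 5, 2],
    }
}


def line_lengths(poem):
    """Split a poem on punctuation and whitespace; return per-line lengths."""
    parts = re.split(r"[，。、；？！,.;?!\s]+", poem)
    return [len(p) for p in parts if p]


def best_variant(poem, cipai):
    """Score the poem against every variant and return the closest one.

    The score is the fraction of line positions whose length matches,
    normalised by the longer of the two line lists (an assumed metric).
    """
    observed = line_lengths(poem)
    best_name, best_score = None, 0.0
    for name, expected in VARIANTS[cipai].items():
        matched = sum(o == e for o, e in zip(observed, expected))
        score = matched / max(len(observed), len(expected))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


def tonal_adherence(tones, pattern):
    """Fraction of constrained positions where the tone matches the pattern."""
    if len(tones) != len(pattern):
        return 0.0
    constrained = [(t, p) for t, p in zip(tones, pattern) if p != "X"]
    if not constrained:
        return 1.0
    return sum(t == p for t, p in constrained) / len(constrained)


# Placeholder characters stand in for a generated poem.
poem = "甲乙丙，丁戊己。庚辛壬癸子丑寅。"
name, score = best_variant(poem, "demo_cipai")
```

Scoring against every variant and keeping the best match, rather than comparing with a single canonical template, is what keeps a valid but unexpected variant from being counted as a structural error.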
Problem

Research questions and friction points this paper is trying to address.

Chinese Ci poetry
large language models
structural rigidity
rhythmic harmony
literary quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chinese Ci poetry
LLM benchmark
tonal pattern
Cipai variants
constrained generation
Shangqing Zhao
University of Oklahoma
wireless networking, network security, wireless communication
Yupei Ren
East China Normal University
Argument Mining, Intelligent Education
Yuhao Zhou
School of Computer Science and Technology, East China Normal University, China
Xiaopeng Bai
Shanghai Institute of Artificial Intelligence for Education, East China Normal University, China; Department of Chinese Language and Literature, East China Normal University, China
Man Lan
East China Normal University, School of Computer Science and Technology
NLP