🤖 AI Summary
This study investigates whether large language models genuinely comprehend modern Chinese poetry beyond merely generating or translating it. To this end, we propose a multidimensional evaluation framework that introduces, for the first time, deep-level metrics such as alignment with the poet’s intended meaning and fidelity in capturing poetic essence. Leveraging collaborative assessments by professional poets, semantic analyses across multiple dimensions, and controlled human-subject experiments, we establish a reproducible evaluation paradigm. Our systematic evaluation of ChatGPT’s interpretive capabilities reveals that its interpretations align with the poet’s original intent in over 73% of cases; however, it still shows notable limitations in higher-order dimensions of poetic understanding, particularly in capturing poeticity.
📝 Abstract
ChatGPT has demonstrated remarkable capabilities in both poetry generation and translation, yet its ability to truly understand poetry remains unexplored. Previous poetry-related work merely analyzed experimental outcomes without addressing fundamental issues of comprehension. This paper introduces a comprehensive framework for evaluating ChatGPT's understanding of modern poetry. We collaborated with professional poets to evaluate ChatGPT's interpretations of modern Chinese poems by different poets along multiple dimensions. Evaluation results show that ChatGPT's interpretations align with the original poets' intents in over 73% of cases. However, its understanding along certain dimensions, particularly in capturing poeticity, is less satisfactory. These findings highlight the effectiveness and necessity of our proposed framework. This study not only evaluates ChatGPT's ability to understand modern poetry but also establishes a solid foundation for future research on LLMs and their application to poetry-related tasks.