🤖 AI Summary
This study addresses the challenge of emotion recognition in modern Korean poetry, where figurative language and cultural specificity impede accurate classification by existing models. We introduce KPoEM, the first dataset for computational emotion analysis of Korean poetry, comprising 7,662 multi-label entries (line-level and work-level) annotated with 44 fine-grained, culture-adapted emotion categories. We further propose a two-stage sequential fine-tuning strategy: a Korean language model is first fine-tuned on general-domain corpora, then fine-tuned on KPoEM. The resulting model achieves a micro-F1 score of 0.60, substantially outperforming models trained only on general corpora (0.34), and captures nuanced poetic emotion and culturally embedded expressions. Key contributions include: (1) the first manually curated, emotion-annotated Korean poetry dataset; (2) a culturally grounded, fine-grained emotion taxonomy; and (3) a novel paradigm for emotion analysis tailored to literary texts.
📝 Abstract
This study introduces KPoEM (Korean Poetry Emotion Mapping), a novel dataset for computational emotion analysis in modern Korean poetry. Despite remarkable progress in text-based emotion classification using large language models, poetry, and Korean poetry in particular, remains underexplored because of its figurative language and cultural specificity. We built a multi-label emotion dataset of 7,662 entries, including 7,007 line-level entries from 483 poems and 615 work-level entries, drawn from the works of five influential Korean poets and annotated with 44 fine-grained emotion categories. A state-of-the-art Korean language model fine-tuned on this dataset significantly outperformed previous models, achieving an F1-micro of 0.60 compared with 0.34 for models trained on general corpora. The KPoEM model, trained through sequential fine-tuning (first on general corpora, then on the KPoEM dataset), demonstrates not only an enhanced ability to identify temporally and culturally specific emotional expressions, but also a strong capacity to preserve the core sentiments of modern Korean poetry. This study bridges computational methods and literary analysis, presenting new possibilities for the quantitative exploration of poetic emotions through structured data that faithfully retains the emotional and cultural nuances of Korean literature.
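The F1-micro scores reported above are micro-averaged, meaning that true positives, false positives, and false negatives are pooled across all labels before F1 is computed. The following is a minimal sketch of that calculation for a multi-label task. The function name `micro_f1` and the three-label toy data are illustrative assumptions; this is not the authors' evaluation code and does not use the actual 44 categories.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 over binary multi-label indicator rows.

    Each row is one entry (e.g. a line of a poem); each column is one
    label, with 1 meaning the label applies and 0 meaning it does not.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1          # label correctly predicted
            elif t == 0 and p == 1:
                fp += 1          # label predicted but not annotated
            elif t == 1 and p == 0:
                fn += 1          # annotated label missed
    denom = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Toy data with three hypothetical labels (not the KPoEM taxonomy).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {score:.3f}")
```

Because counts are pooled across labels, frequent emotion categories influence a micro-averaged score more than rare ones, which is worth keeping in mind when reading a single headline figure for a 44-label taxonomy.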