🤖 AI Summary
This study addresses the challenge of large-scale, cross-lingual variant analysis in folk narrative typology. It proposes an automated framework for motif identification and comparison that combines large language models (LLMs) with NLP techniques to extract fine-grained motifs from multilingual *Cinderella* texts. Clustering and dimensionality reduction are then used to model and visualize structural similarity across hundreds of variants. The key contributions are twofold: (1) the first deep integration of LLMs into computational folklore analysis, which removes the bottleneck of manual coding; and (2) a cross-lingually aligned motif vector space that improves the detection of thematic consistency and the recognition of motif variation patterns. Experimental evaluation indicates that the framework is effective in terms of motif identification accuracy, cross-lingual comparability, and interpretability of cultural divergence, offering a scalable digital-humanities methodology for folk narrative studies.
📝 Abstract
Artificial intelligence approaches are being adapted to many research areas, including digital humanities. We built a methodology for large-scale analyses in folkloristics. Using machine learning and natural language processing, we automatically detected motifs in a large collection of Cinderella variants and analysed their similarities and differences with clustering and dimensionality reduction. The results show that large language models detect complex interactions in tales, enabling computational analysis of extensive text collections and facilitating cross-lingual comparisons.
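The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes motif detection has already produced a set of motif labels per variant, and it swaps in Jaccard distance with average-linkage agglomerative clustering as a stand-in, since the abstract does not name the clustering algorithm. The variant names, motif labels, and the 0.7 threshold are all hypothetical.

```python
from itertools import combinations

# Hypothetical motif annotations: variant name -> set of detected motif labels.
# These names and labels are illustrative and not taken from the study's data.
variants = {
    "variant_fr": {"persecuted_heroine", "fairy_godmother", "lost_slipper", "royal_ball"},
    "variant_de": {"persecuted_heroine", "helpful_tree", "lost_slipper", "royal_ball"},
    "variant_cn": {"persecuted_heroine", "helpful_fish", "lost_slipper", "festival"},
    "variant_x": {"animal_bride", "quest", "helpful_fish"},
}


def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, data):
    """Mean pairwise Jaccard distance between the members of two clusters."""
    dists = [jaccard_distance(data[x], data[y]) for x in c1 for y in c2]
    return sum(dists) / len(dists)


def cluster(data, threshold):
    """Merge the closest pair of clusters until the closest pair exceeds threshold."""
    clusters = [frozenset([name]) for name in sorted(data)]
    while len(clusters) > 1:
        (a, b), dist = min(
            (((c1, c2), average_linkage(c1, c2, data))
             for c1, c2 in combinations(clusters, 2)),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Assumed threshold; in practice it would be tuned on the collection.
groups = cluster(variants, threshold=0.7)
for group in groups:
    print(sorted(group))
```

With these toy annotations, the three variants that share the persecuted-heroine and lost-slipper motifs end up in one group, while the fourth stays separate. For a real collection, a library implementation of hierarchical clustering and a dimensionality reduction method for plotting would likely replace this hand-written loop.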