🤖 AI Summary
This study examines how Korean–English code-switching and English usage in Billboard-charting K-pop songs (2017–2025) relate to global chart success, particularly in the U.S. market. Using a dataset spanning 14 groups and 8 solo artists, it quantifies the proportions of English and Korean lyrics and the frequency of code-switching, and compares these linguistic strategies across chart context (Hot 100 vs. Global 200) and performer gender. English dominates the lyrics of globally charting songs, and higher English usage appears to matter more for the U.S.-focused Hot 100 than for the Global 200. Male and female performers both show high levels of code-switching and English usage with no statistically significant gender difference, although female solo artists tend to favour English more consistently. A classifier combining multilingual embeddings with handcrafted features reaches a macro-F1 score of up to 0.76 when predicting performer gender from lyrics. The work offers a quantitative view of how language strategy in K-pop lyrics relates to chart context and performer identity.
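For readers unfamiliar with the macro-F1 score reported above, the following minimal sketch shows how the metric is conventionally computed: the F1 score is calculated separately for each class and the per-class scores are averaged with equal weight. The function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with hypothetical gender labels
y_true = ["f", "f", "m", "m"]
y_pred = ["f", "m", "m", "m"]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally, macro-F1 is not inflated by good performance on a majority class alone, which makes it a common choice when class sizes differ.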
📝 Abstract
Code-switching, particularly between Korean and English, has become a defining feature of modern K-pop, reflecting both aesthetic choices and global market strategies. This paper is a primary investigation into the linguistic strategies employed in K-pop songs that achieve global chart success, with a focus on the role of code-switching and English lyric usage. A dataset of K-pop songs that appeared on the Billboard Hot 100 and Global 200 charts from 2017 to 2025, spanning 14 groups and 8 solo artists, was compiled. Using this dataset, the proportions of English and Korean lyrics, the frequency of code-switching, and other stylistic features were analysed. It was found that English dominates the linguistic landscape of globally charting K-pop songs, with both male and female performers exhibiting high degrees of code-switching and English usage. Statistical tests indicated no significant gender-based differences, although female solo artists tend to favour English more consistently. A classification task was also performed to predict performer gender from lyrics, achieving macro-F1 scores of up to 0.76 using multilingual embeddings and handcrafted features. Finally, differences between songs charting on the Hot 100 and those charting on the Global 200 were examined. The comparison suggests that, while there is no significant gender difference in English usage, higher English usage may be more critical for success on the US-focused Hot 100. The findings highlight how linguistic choices in K-pop lyrics are shaped by global market pressures and reveal stylistic patterns that reflect performer identity and chart context.
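To make the measured quantities concrete, the sketch below shows one simple way the proportion of English and Korean tokens and the number of code-switch points could be computed for a lyric. It is an assumption-laden illustration, not the paper's pipeline: tokens are split on whitespace, language is guessed from script alone (Hangul versus Latin letters), a token containing any Hangul counts as Korean, and the names `token_language` and `lyric_language_stats` are hypothetical.

```python
import re

# Script-based heuristics (assumption: script identifies the language)
HANGUL = re.compile(r"[\u1100-\u11FF\u3130-\u318F\uAC00-\uD7A3]")
LATIN = re.compile(r"[A-Za-z]")


def token_language(token):
    """Return 'ko', 'en', or None for tokens with no letters of either script."""
    if HANGUL.search(token):
        return "ko"
    if LATIN.search(token):
        return "en"
    return None


def lyric_language_stats(lyrics):
    """Token-level language ratios and the count of Korean/English switch points."""
    labels = [lab for lab in map(token_language, lyrics.split()) if lab is not None]
    total = len(labels)
    # A switch point is any pair of adjacent labelled tokens in different languages
    switches = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return {
        "english_ratio": labels.count("en") / total if total else 0.0,
        "korean_ratio": labels.count("ko") / total if total else 0.0,
        "switches": switches,
    }


# Made-up example line, not drawn from the dataset
stats = lyric_language_stats("너를 사랑해 baby I love you 정말")
```

A script-based heuristic is cheap and transparent, but it would mislabel romanised Korean as English, so a real analysis would likely need a proper language identifier or manual annotation.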