Language models for longitudinal analysis of abusive content in Billboard Music Charts

📅 2025-10-05
🤖 AI Summary
This study investigates long-term trends in derogatory and sexually suggestive content in Billboard Hot 100 lyrics from 1950 to 2020, aiming to assess potential developmental impacts on children and adolescents and inform evidence-based content regulation policies. We employ an automated analytical framework grounded in pre-trained language models (BERT and RoBERTa), integrating fine-grained sentiment analysis with multi-class abusive content detection—enabling, for the first time, a systematic, large-scale, longitudinal quantitative analysis spanning seven decades. Results reveal a marked increase beginning in the 1990s: profanity, objectifying language, and explicit sexual references rose three- to fivefold. Crucially, the models robustly capture micro-level shifts in sociolinguistic norms over time. This work establishes a reproducible methodological paradigm for cultural content governance and delivers empirically grounded insights for policy development in media regulation.

📝 Abstract
Abusive and sexually explicit content in music, particularly in the Billboard Music Charts, appears to have increased drastically. However, few studies validate this trend to support effective policy development, even though such content can lead to harmful behavioural changes in children and youths. In this study, we utilise deep learning methods to analyse song lyrics from the Billboard Charts of the United States over the last seven decades. We provide a longitudinal study using deep learning and language models and review the evolution of content using sentiment analysis and abuse detection, including sexually explicit content. Our results show a significant rise in explicit content in popular music from 1990 onwards. Furthermore, we find an increasing prevalence of songs with lyrics containing profane, sexually explicit, and otherwise inappropriate language. The longitudinal analysis demonstrates the ability of language models to capture nuanced patterns in lyrical content, reflecting shifts in societal norms and language use over time.
Problem

Research questions and friction points this paper is trying to address.

Analyzing abusive content trends in Billboard Music Charts
Validating explicit content increase using deep learning methods
Tracking lyrical content evolution through longitudinal language analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning analyzes Billboard song lyrics
Language models track abusive content evolution
Sentiment analysis detects explicit content trends
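
The longitudinal step described above (label each song, then track the share of flagged songs over time) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the study uses BERT and RoBERTa classifiers, whereas `placeholder_is_abusive` and `PLACEHOLDER_TERMS` are hypothetical stand-ins so the example runs without any model download, and `abusive_share_by_decade` is an illustrative name.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical placeholder lexicon (assumption). The study relies on
# pre-trained language models rather than keyword matching.
PLACEHOLDER_TERMS = {"damn", "hell"}


def placeholder_is_abusive(lyrics: str) -> bool:
    """Stand-in for a language-model classifier; not the paper's model."""
    tokens = {tok.strip(".,!?\"'").lower() for tok in lyrics.split()}
    return bool(tokens & PLACEHOLDER_TERMS)


def decade_of(year: int) -> int:
    """Map a chart year to the first year of its decade, e.g. 1995 -> 1990."""
    return (year // 10) * 10


def abusive_share_by_decade(
    songs: Iterable[Tuple[int, str]],
    classify: Callable[[str], bool] = placeholder_is_abusive,
) -> Dict[int, float]:
    """Return the fraction of songs flagged by `classify` in each decade."""
    totals: Dict[int, int] = defaultdict(int)
    flagged: Dict[int, int] = defaultdict(int)
    for year, lyrics in songs:
        decade = decade_of(year)
        totals[decade] += 1
        if classify(lyrics):
            flagged[decade] += 1
    return {decade: flagged[decade] / totals[decade] for decade in sorted(totals)}


# Toy input (invented for illustration, not data from the study).
toy_songs = [
    (1955, "love me tender"),
    (1958, "what the hell"),
    (1995, "damn right"),
    (1999, "hell yeah"),
]
shares = abusive_share_by_decade(toy_songs)
```

Passing a different `classify` callable, for example a wrapper around a fine-tuned text classifier, would leave the per-decade aggregation unchanged; only the labelling step is swapped.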
Rohitash Chandra
UNSW
Bayesian deep learning, Neuroevolution, Climate Extremes, Language Models, Comparative Religion
Yathin Suresh
Transitional Artificial Intelligence Research Group, School of Mathematics and Statistics, UNSW Sydney, NSW 2006, Australia
Divyansh Raj Sinha
Department of Electrical Engineering, Indian Institute of Technology Delhi, Delhi, India
Sanchit Jindal
Department of Electrical Engineering, Indian Institute of Technology Delhi, Delhi, India