🤖 AI Summary
This study investigates implicit cultural biases in large language models (LLMs) when inferring author gender and ethnicity from song lyrics under zero-shot settings. Analyzing over 10,000 lyrics, the work introduces two novel fairness metrics—Modality Accuracy Divergence (MAD) and Recall Divergence (RD)—and combines predictive distribution analysis with generated rationale interpretation to systematically assess cultural alignment across multiple open-source LLMs. Results reveal a predominant default bias toward North American ethnicities among most models, with DeepSeek-1.5B showing a relative inclination toward Asian ethnicities, Ministral-8B exhibiting the most pronounced bias, and Gemma-12B demonstrating the most balanced performance. This work presents the first quantitative characterization of cultural bias in LLMs for ethnicity profiling tasks.
📝 Abstract
Large language models (LLMs) are increasingly deployed in applications with societal impact, raising concerns about the cultural biases they encode. We probe these representations by evaluating whether LLMs can perform author profiling from song lyrics in a zero-shot setting, inferring singers' gender and ethnicity without task-specific fine-tuning. Across several open-source models evaluated on more than 10,000 lyrics, we find that LLMs achieve non-trivial profiling performance but demonstrate systematic cultural alignment: most models default toward North American ethnicity, while DeepSeek-1.5B aligns more strongly with Asian ethnicity. This finding emerges from both the models' prediction distributions and an analysis of their generated rationales. To quantify these disparities, we introduce two fairness metrics, Modality Accuracy Divergence (MAD) and Recall Divergence (RD), and show that Ministral-8B displays the strongest ethnicity bias among the evaluated models, whereas Gemma-12B shows the most balanced behavior. Our code is available on GitHub (https://github.com/ValentinLafargue/CulturalProbingLLM).
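The abstract names two fairness metrics, Modality Accuracy Divergence (MAD) and Recall Divergence (RD), without giving their formulas. As a rough illustration of the kind of quantity an RD-style metric captures, the sketch below computes per-group recall and reports the spread between the best- and worst-served groups. The function names, the max-minus-min divergence, and the toy data are all assumptions for illustration; the paper's exact definitions may differ.

```python
# Hypothetical sketch of a "Recall Divergence"-style fairness check:
# compute recall separately per sensitive group (e.g., ethnicity) and
# report the gap between the highest and lowest per-group recall.
# This is illustrative only; the paper defines MAD/RD precisely.

from collections import defaultdict

def per_group_recall(y_true, y_pred, groups):
    """Recall of the positive class, computed separately per group."""
    tp = defaultdict(int)  # true positives per group
    fn = defaultdict(int)  # false negatives per group
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            if p == 1:
                tp[g] += 1
            else:
                fn[g] += 1
    return {g: tp[g] / (tp[g] + fn[g])
            for g in set(groups) if tp[g] + fn[g] > 0}

def recall_divergence(recalls):
    """Max-minus-min gap across per-group recalls (0 = perfectly balanced)."""
    vals = list(recalls.values())
    return max(vals) - min(vals)

# Toy example with two groups: group A is recalled perfectly,
# group B only one time out of three.
y_true = [1, 1, 1, 1, 1, 1]
y_pred = [1, 1, 1, 1, 0, 0]
groups = ["A", "A", "A", "B", "B", "B"]

r = per_group_recall(y_true, y_pred, groups)
print(r)                      # {'A': 1.0, 'B': 0.333...} (order may vary)
print(recall_divergence(r))   # 0.666...
```

A larger divergence means the model serves some groups markedly better than others, which is the kind of disparity the paper reports for ethnicity predictions.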