Large Language Models Discriminate Against Speakers of German Dialects

📅 2025-09-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study uncovers, for the first time, two kinds of bias in large language models (LLMs) against German dialect speakers: *dialect naming bias* (triggered by explicitly labeling someone a dialect speaker) and *dialect usage bias* (triggered by the dialect text itself). Grounded in sociolinguistic theory, the authors construct a paired evaluation corpus covering seven German dialects and Standard German, and design association and decision-making tasks for quantitative bias assessment. Experiments show that all evaluated LLMs exhibit significant pro-Standard-German bias, consistently associating dialect speech with negative adjectives and rendering unfair decisions. Crucially, explicitly labeling utterances as "dialect" does not mitigate bias; it amplifies discriminatory effects, challenging prior findings that explicit demographic mentions elicit minimal bias. The work provides an evaluation paradigm and empirical benchmark for assessing fairness toward dialect speakers in language technologies.

📝 Abstract
Dialects represent a significant component of human culture and are found across all regions of the world. In Germany, more than 40% of the population speaks a regional dialect (Adler and Hansen, 2022). However, despite their cultural importance, individuals who speak dialects often face negative societal stereotypes. We examine whether such stereotypes are mirrored by large language models (LLMs). We draw on the sociolinguistic literature on dialect perception to analyze traits commonly associated with dialect speakers. Based on these traits, we assess the dialect naming bias and dialect usage bias expressed by LLMs in two tasks: an association task and a decision task. To assess a model's dialect usage bias, we construct a novel evaluation corpus that pairs sentences from seven regional German dialects (e.g., Alemannic and Bavarian) with their standard German counterparts. We find that: (1) in the association task, all evaluated LLMs exhibit significant dialect naming and dialect usage bias against German dialect speakers, reflected in negative adjective associations; (2) all models reproduce these dialect naming and dialect usage biases in their decision making; and (3) contrary to prior work showing minimal bias with explicit demographic mentions, we find that explicitly labeling linguistic demographics--German dialect speakers--amplifies bias more than implicit cues like dialect usage.
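The paired association-task setup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the trait adjectives, prompt template, sentence pair, and the `llm_choose` callback are all hypothetical stand-ins, not the authors' actual materials or models; a stub model replaces a real LLM so the sketch is runnable.

```python
# Sketch of a paired dialect-vs-standard association task.
# Assumptions (not from the paper): adjective list, prompt wording,
# example sentences, and the stub model are illustrative only.
from dataclasses import dataclass

# Hypothetical trait adjectives with polarity (+1 positive, -1 negative).
TRAIT_ADJECTIVES = {"educated": 1, "friendly": 1, "uneducated": -1, "rude": -1}

@dataclass
class SentencePair:
    dialect: str   # e.g., a Bavarian sentence
    standard: str  # its Standard German counterpart

def association_score(sentence: str, llm_choose) -> float:
    """Mean polarity of the adjectives the model associates with the speaker."""
    prompt = f'Which adjectives describe the person who said: "{sentence}"?'
    chosen = llm_choose(prompt, list(TRAIT_ADJECTIVES))
    return sum(TRAIT_ADJECTIVES[a] for a in chosen) / max(len(chosen), 1)

def usage_bias(pair: SentencePair, llm_choose) -> float:
    """Positive value: the model favors the Standard German variant."""
    return (association_score(pair.standard, llm_choose)
            - association_score(pair.dialect, llm_choose))

# Stub "model" for demonstration: biased toward standard orthography.
def stub_llm(prompt: str, adjectives: list) -> list:
    if "Ich habe" in prompt:  # crude cue for the standard sentence
        return [a for a in adjectives if TRAIT_ADJECTIVES[a] > 0]
    return [a for a in adjectives if TRAIT_ADJECTIVES[a] < 0]

pair = SentencePair(dialect="I hob koa Zeit.", standard="Ich habe keine Zeit.")
bias = usage_bias(pair, stub_llm)
print(bias)  # 2.0: the stub fully favors the standard variant
```

In the paper's actual protocol, `llm_choose` would query a real LLM, the score would be aggregated over the full seven-dialect corpus, and significance would be tested statistically; the decision task replaces adjective choice with concrete judgments (e.g., hiring-style decisions).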
Problem

Research questions and friction points this paper is trying to address.

Assessing bias in LLMs against German dialect speakers
Evaluating dialect naming and usage bias in association tasks
Examining bias amplification with explicit demographic labeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constructing a paired dialect–Standard German evaluation corpus
Quantifying dialect naming and dialect usage bias in LLMs
Showing that explicit linguistic demographic labeling amplifies bias