AI Summary
This study investigates cultural biases in large language models (LLMs) when detecting ableist speech within the Indian sociocultural context. Method: We translated an English ableism dataset into Hindi, augmented it with ground-truth annotations from Indian disability communities, and used prompt engineering to conduct a cross-lingual, cross-cultural evaluation across eight models, including GPT-4, Gemini, and the India-developed Krutrim. Contribution/Results: Western models consistently overestimated harm, while Indian models tended to underestimate it; all models were more permissive toward ableist content expressed in Hindi, revealing misalignment between their training data and culturally grounded evaluation criteria. This work presents the first systematic analysis of geographically situated bias in AI-based ableism detection and proposes a community-informed, locally anchored evaluation framework to advance inclusive, responsible AI development for the Global South.
Abstract
People with disabilities (PwD) experience disproportionately high levels of discrimination and hate online, particularly in India, where entrenched stigma and limited resources intensify these challenges. Large language models (LLMs) are increasingly used to identify and mitigate online hate, yet most research on online ableism focuses on Western audiences and Western AI models. Are these models adequately equipped to recognize ableist harm in non-Western places like India? Do localized, Indic language models perform better? To investigate, we adapted a publicly available ableist speech dataset, translated it into Hindi, and prompted eight LLMs--four developed in the U.S. (GPT-4, Gemini, Claude, Llama) and four in India (Krutrim, Nanda, Gajendra, Airavata)--to score and explain ableism. In parallel, we recruited 175 PwD from both the U.S. and India to perform the same task, revealing stark differences between groups. Western LLMs consistently overestimated ableist harm, while Indic LLMs underestimated it. Even more concerning, all LLMs were more tolerant of ableism when it was expressed in Hindi, and all asserted Western framings of ableist harm. In contrast, Indian PwD interpreted harm through intention, relationality, and resilience--emphasizing a desire to inform and educate perpetrators. This work lays the groundwork for global, inclusive standards of ableism, demonstrating the need to center local disability experiences in the design and evaluation of AI systems.