IndiCASA: A Dataset and Bias Evaluation Framework in LLMs Using Contrastive Embedding Similarity in the Indian Context

📅 2025-10-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenge that existing embedding-based bias evaluation methods fail to detect implicit social biases—along five dimensions (caste, gender, religion, disability, and socioeconomic status)—embedded in large language models (LLMs) within India's multicultural context. To this end, we propose a fine-grained, India-contextualized bias evaluation framework. Our contributions are threefold: (1) we construct IndiCASA, a human-validated, culturally adapted dataset built on IndiBias, comprising 2,575 contextually aligned stereotype and anti-stereotype sentences; (2) we introduce a contrastive learning–based embedding-similarity method, augmented with counterfactual samples, to quantify bias along multiple dimensions; and (3) we systematically evaluate leading open-weight LLMs, revealing that disability-related bias is most pronounced, while religious bias is comparatively lowest. These findings underscore the importance of localized, culturally grounded bias assessment for advancing model fairness and equity in diverse sociolinguistic settings.
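To make the contrastive setup concrete, here is a minimal, hypothetical sketch of how a sentence encoder could be fine-tuned on stereotype/anti-stereotype triplets with counterfactual positives. The base model, the triplet-loss choice, and the example sentences are illustrative assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch: fine-tune a sentence encoder with a triplet-style
# contrastive objective so that a stereotype and its counterfactual (same
# stereotype rewritten for another group) embed close together, while the
# corresponding anti-stereotype is pushed apart. Data are illustrative.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder base encoder

# Each triplet: (anchor stereotype, counterfactual positive, anti-stereotype negative).
train_examples = [
    InputExample(texts=[
        "He was denied the loan because of his community.",   # anchor stereotype
        "She was denied the loan because of her community.",  # counterfactual positive
        "He was granted the loan based on his credit history.",  # anti-stereotype negative
    ]),
]

loader = DataLoader(train_examples, shuffle=True, batch_size=16)
loss = losses.TripletLoss(model=model)
model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
```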

📝 Abstract
Large Language Models (LLMs) have gained significant traction across critical domains owing to their impressive contextual understanding and generative capabilities. However, their increasing deployment in high-stakes applications necessitates rigorous evaluation of embedded biases, particularly in culturally diverse contexts like India, where existing embedding-based bias assessment methods often fall short in capturing nuanced stereotypes. We propose an evaluation framework based on an encoder trained using contrastive learning that captures fine-grained bias through embedding similarity. We also introduce a novel dataset, IndiCASA (IndiBias-based Contextually Aligned Stereotypes and Anti-stereotypes), comprising 2,575 human-validated sentences spanning five demographic axes: caste, gender, religion, disability, and socioeconomic status. Our evaluation of multiple open-weight LLMs reveals that all models exhibit some degree of stereotypical bias, with disability-related biases being notably persistent and religious bias generally lower, likely due to global debiasing efforts, demonstrating the need for fairer model development.
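As a rough illustration of how such an encoder could be used for scoring, the sketch below compares how close a candidate model generation sits to a stereotype versus its anti-stereotype in embedding space. The similarity-gap scoring rule and all sentences here are assumptions for exposition, not the paper's published metric.

```python
# Hedged illustration of embedding-similarity bias scoring: measure whether
# a generation is more similar to a stereotype or to its anti-stereotype.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for the trained encoder

stereotype      = "People from that community are untrustworthy."
anti_stereotype = "People from that community are trustworthy."
generation      = "Honestly, people from that community can't be trusted."

emb = encoder.encode([stereotype, anti_stereotype, generation], convert_to_tensor=True)
sim_stereo = util.cos_sim(emb[2], emb[0]).item()
sim_anti   = util.cos_sim(emb[2], emb[1]).item()

# A positive gap suggests the generation leans toward the stereotype.
bias_score = sim_stereo - sim_anti
print(f"stereotype sim={sim_stereo:.3f}, anti sim={sim_anti:.3f}, gap={bias_score:.3f}")
```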
Problem

Research questions and friction points this paper is trying to address.

Evaluating embedded biases in LLMs within Indian cultural contexts
Assessing nuanced stereotypes across caste, gender, religion, disability, and socioeconomic status
Developing a framework that uses contrastive embedding similarity for fine-grained bias detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive learning encoder for bias evaluation
Human-validated dataset across five demographic axes
Embedding similarity captures fine-grained cultural biases