Discovering Bias Associations through Open-Ended LLM Generations

📅 2025-08-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing LLM bias evaluation methods rely on predefined identity-concept templates, which limits their ability to detect unknown or implicit social biases in open-ended generation. To address this, we propose the Bias Association Discovery Framework (BADF), the first systematic framework for automatically discovering previously unknown bias associations in unconstrained model outputs. BADF combines prompt engineering, semantic clustering, quantitative measurement of association strength, and cross-model validation into an end-to-end, interpretable pipeline for identifying bias patterns. Experiments across mainstream large language models and diverse real-world scenarios show that BADF recovers well-documented biases, such as gender-occupation stereotypes, and also uncovers previously unreported associations, e.g., region-morality directional biases. The framework's implementation, annotated datasets, and evaluation scripts are publicly released to support reproducible bias research.
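To make "association strength" concrete, here is a minimal sketch that scores identity-concept pairs with pointwise mutual information (PMI). This is an illustrative assumption, not BADF's actual measure: the summary does not say which statistic the framework uses, and the function name `association_scores`, the `(identity, concept)` record format, and the placeholder labels are all hypothetical.

```python
import math
from collections import Counter


def association_scores(records):
    """Return PMI (in bits) for each (identity, concept) pair.

    `records` is an iterable of (identity, concept) tuples, assumed to have
    been extracted from open-ended generations by an earlier step.
    PMI is a stand-in here, not necessarily the measure BADF uses.
    """
    records = list(records)
    total = len(records)
    pair_counts = Counter(records)
    identity_counts = Counter(identity for identity, _ in records)
    concept_counts = Counter(concept for _, concept in records)

    scores = {}
    for (identity, concept), n in pair_counts.items():
        p_pair = n / total
        p_identity = identity_counts[identity] / total
        p_concept = concept_counts[concept] / total
        # PMI > 0: the pair co-occurs more often than independence predicts.
        scores[(identity, concept)] = math.log2(p_pair / (p_identity * p_concept))
    return scores


# Placeholder data only; these labels are not results from the paper.
toy_records = [
    ("group_a", "concept_x"),
    ("group_a", "concept_x"),
    ("group_a", "concept_y"),
    ("group_b", "concept_y"),
    ("group_b", "concept_y"),
    ("group_b", "concept_x"),
]
scores = association_scores(toy_records)
```

A positive score means a concept appears with an identity more often than chance would predict, and a negative score means less often. Raw PMI is unstable for rare pairs, so a real pipeline would likely add a minimum-count threshold or smoothing.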

📝 Abstract

Social biases embedded in Large Language Models (LLMs) raise critical concerns, resulting in representational harms (unfair or distorted portrayals of demographic groups) that may be expressed in subtle ways through generated language. Existing evaluation methods often depend on predefined identity-concept associations, limiting their ability to surface new or unexpected forms of bias. In this work, we present the Bias Association Discovery Framework (BADF), a systematic approach for extracting both known and previously unrecognized associations between demographic identities and descriptive concepts from open-ended LLM outputs. Through comprehensive experiments spanning multiple models and diverse real-world contexts, BADF enables robust mapping and analysis of the varied concepts that characterize demographic identities. Our findings advance the understanding of biases in open-ended generation and provide a scalable tool for identifying and analyzing bias associations in LLMs. Data, code, and results are available at https://github.com/JP-25/Discover-Open-Ended-Generation

Problem

Research questions and friction points this paper is trying to address.

Detecting hidden social biases in LLM outputs

Identifying new bias associations beyond predefined categories

Providing scalable analysis of demographic identity portrayals

Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-ended LLM generations for bias discovery

Systematic framework for unknown bias associations

Scalable tool for analyzing demographic biases
