🤖 AI Summary
This study addresses the pervasive issues of low response diversity, negative sentiment bias, and social stereotyping in Chinese AI systems—specifically Baidu Search, ERNIE Bot, and Qwen. It presents the first cross-platform empirical evaluation, leveraging 240 fine-grained identity prompts spanning 13 socially salient demographic groups in China. Over 30,000 model responses were collected and analyzed using semantic diversity metrics, sentiment polarity analysis, stereotype keyword annotation, and rigorous statistical testing. The work introduces the first multidimensional bias assessment framework tailored to the Chinese linguistic and sociocultural context. Key findings reveal fundamental differences across systems: large language models exhibit higher opinion diversity than search engines; Baidu and Qwen generate significantly more negative content than ERNIE Bot; and all three systems manifest moderate-strength stereotypes, some of which are offensive. These results provide an empirical foundation and methodological toolkit for advancing fairness governance of Chinese-language AI systems.
📝 Abstract
Large Language Models (LLMs) and search engines have the potential to perpetuate biases and stereotypes by amplifying existing prejudices in their training data and algorithmic processes, thereby influencing public perception and decision-making. While most work has focused on Western-centric AI technologies, we study Chinese-based tools by investigating social biases embedded in the major Chinese search engine, Baidu, and two leading LLMs, Ernie and Qwen. Leveraging a dataset of 240 social groups across 13 categories describing Chinese society, we collect over 30k views encoded in the aforementioned tools by prompting them for candidate words describing such groups. We find that language models exhibit a larger variety of embedded views compared to the search engine, although Baidu and Qwen generate negative content more often than Ernie. We also find a moderate prevalence of stereotypes embedded in the language models, many of which potentially promote offensive and derogatory views. Our work highlights the importance of promoting fairness and inclusivity in AI technologies with a global perspective.