Where Should I Study? Biased Language Models Decide! Evaluating Fairness in LMs for Academic Recommendations

📅 2025-09-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically evaluates multidimensional biases—geographic, demographic, and socioeconomic—of large language models (LLMs) in academic recommendation tasks. We propose the first fairness-aware evaluation framework tailored to educational recommendation, moving beyond conventional accuracy metrics to quantify representation bias in university and major recommendations, imbalance in Global North–South institutional coverage, and gender stereotyping. Using LLaMA-3.1-8B, Gemma-7B, and Mistral-7B, we generate over 25,000 recommendations for 360 simulated users characterized by diverse gender, nationality, and socioeconomic backgrounds. Results reveal pronounced systemic biases: strong preference for Global North institutions, reinforcement of gender stereotypes, and high recommendation redundancy. Although LLaMA-3.1 achieves the broadest coverage (481 universities across 58 countries), it still exhibits significant inequities. Our work establishes a reproducible, empirically grounded assessment paradigm to advance fair governance of LLMs in education.

📝 Abstract
Large Language Models (LLMs) are increasingly used as daily recommendation systems for tasks like education planning, yet their recommendations risk perpetuating societal biases. This paper empirically examines geographic, demographic, and economic biases in university and program suggestions from three open-source LLMs: LLaMA-3.1-8B, Gemma-7B, and Mistral-7B. Using 360 simulated user profiles varying by gender, nationality, and economic status, we analyze over 25,000 recommendations. Results show strong biases: institutions in the Global North are disproportionately favored, recommendations often reinforce gender stereotypes, and institutional repetition is prevalent. While LLaMA-3.1 achieves the highest diversity, recommending 481 unique universities across 58 countries, systemic disparities persist. To quantify these issues, we propose a novel, multi-dimensional evaluation framework that goes beyond accuracy by measuring demographic and geographic representation. Our findings highlight the urgent need for bias consideration in educational LMs to ensure equitable global access to higher education.
Problem

Research questions and friction points this paper is trying to address.

Evaluating fairness in LM academic recommendation systems
Assessing geographic, demographic, and economic biases in university suggestions
Measuring systemic disparities in educational LM recommendations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-dimensional evaluation framework for bias measurement
Simulated user profiles varying in demographic attributes
Analyzing geographic and demographic representation in recommendations
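The evaluation pipeline described above (simulated profiles built from gender, nationality, and socioeconomic attributes, then aggregate metrics over the resulting recommendations) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the attribute values, record fields, and metric names below are assumptions; only the three attribute axes and the kinds of measures (coverage, Global North share, redundancy) come from the abstract.

```python
from itertools import product
from collections import Counter

# Illustrative attribute values only -- the paper's exact profile schema is
# not reproduced here. The abstract specifies three axes (gender,
# nationality, socioeconomic status) yielding 360 profiles in total.
GENDERS = ["female", "male"]
NATIONALITIES = ["India", "Nigeria", "Brazil", "Germany", "USA", "Vietnam"]
SES_LEVELS = ["low", "middle", "high"]

def make_profiles(genders, nationalities, ses_levels):
    """Cartesian product of attribute values -> one simulated user each."""
    return [{"gender": g, "nationality": n, "ses": s}
            for g, n, s in product(genders, nationalities, ses_levels)]

def coverage(recs):
    """Distinct universities and countries across all recommendations."""
    return (len({r["university"] for r in recs}),
            len({r["country"] for r in recs}))

def global_north_share(recs, global_north):
    """Fraction of recommendations pointing at Global North institutions."""
    return sum(r["country"] in global_north for r in recs) / len(recs)

def redundancy(recs):
    """Share of recommendations taken by the single most-repeated university."""
    counts = Counter(r["university"] for r in recs)
    return counts.most_common(1)[0][1] / len(recs)
```

In this setup, each profile would be rendered into a recommendation prompt for an LLM, and the returned university/country records fed into the metric functions; with LLaMA-3.1's reported figures, `coverage` would return (481, 58).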
Krithi Shailya
Centre for Responsible AI (CeRAI), Wadhwani School of Data Science and AI (WSAI), Indian Institute of Technology Madras
Akhilesh Kumar Mishra
Centre for Responsible AI (CeRAI), Wadhwani School of Data Science and AI (WSAI), Indian Institute of Technology Madras
Gokul S Krishnan
Senior Research Scientist, CeRAI, IIT Madras
Natural Language Processing, Machine Learning, Data Science, Healthcare Informatics
Balaraman Ravindran
Professor of Data Science and AI, Wadhwani School of Data Science and AI, IIT Madras
Reinforcement Learning, Data Mining, Network Analysis, Responsible AI