Poor Alignment and Steerability of Large Language Models: Evidence from College Admission Essays

📅 2025-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) exhibit substantial alignment and steerability deficiencies when generating undergraduate admissions essays—a high-stakes educational application. Method: We quantitatively compare linguistic properties (word frequency, n-gram diversity, syntactic complexity) of LLM-generated essays—produced via two prompting strategies (essay prompt only vs. prompt + demographic attributes)—against 30,000 real human-authored application essays, stratified by sex, race, first-generation status, and geographic location. Results: All evaluated models (GPT, Claude, etc.) systematically deviate from human linguistic patterns, with statistically significant disparities across all demographic axes. Crucially, incorporating demographic information fails to improve stylistic fidelity to specific groups; inter-model similarity exceeds model-to-human similarity, confirming poor steerability. This study provides empirical evidence of linguistic homogenization in LLMs within fairness-critical educational contexts, underscoring concerns about their deployment in equitable decision-support systems.
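One of the lexical measures named above, n-gram diversity, is commonly operationalized as a distinct-n ratio (unique n-grams over total n-grams). The sketch below is an illustrative version of that metric, not necessarily the paper's exact implementation; the example texts and function names are hypothetical.

```python
def ngrams(tokens, n):
    """Return all length-n token windows (n-grams) as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def distinct_n(text, n=2):
    """Distinct-n: unique n-grams divided by total n-grams.

    Higher values indicate more lexical variety; homogenized text
    (e.g., repetitive LLM output) tends to score lower.
    """
    tokens = text.lower().split()
    grams = ngrams(tokens, n)
    if not grams:
        return 0.0
    return len(set(grams)) / len(grams)

# Toy comparison (illustrative strings, not the paper's data)
varied = "the summer I spent fixing bikes taught me patience and grit"
repetitive = "I learned valuable lessons and I learned important skills and I learned"
print(distinct_n(varied, 2))      # 1.0 — every bigram is unique
print(distinct_n(repetitive, 2))  # lower, due to repeated bigrams
```

Aggregating this ratio per corpus (human vs. each model, per demographic group) is one simple way to quantify the homogenization the summary describes.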

📝 Abstract
People are increasingly using technologies equipped with large language models (LLMs) to write texts for formal communication, which raises two important questions at the intersection of technology and society: who do LLMs write like (model alignment), and can LLMs be prompted to change who they write like (model steerability)? We investigate these questions in the high-stakes context of undergraduate admissions at a selective university by comparing lexical and sentence variation between essays written by 30,000 applicants and two types of LLM-generated essays: one prompted with only the essay question used by the human applicants, and another with additional demographic information about each applicant. We consistently find that both types of LLM-generated essays are linguistically distinct from human-authored essays, regardless of the specific model and analytical approach. Further, prompting a specific sociodemographic identity is remarkably ineffective in aligning the model with the linguistic patterns observed in human writing from this identity group. This holds along the key dimensions of sex, race, first-generation status, and geographic location. The demographically prompted and unprompted synthetic texts were also more similar to each other than to the human text, meaning that prompting did not alleviate homogenization. These issues of model alignment and steerability in current LLMs raise concerns about the use of LLMs in high-stakes contexts.
Problem

Research questions and friction points this paper is trying to address.

Assessing LLM alignment with human writing styles
Evaluating steerability of LLMs via demographic prompts
Examining linguistic homogenization in LLM-generated texts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compare human and LLM-generated admission essays
Test alignment with demographic prompting
Analyze linguistic differences across key dimensions
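The comparison steps above hinge on measuring how similar two bodies of text are. A minimal sketch of one such measurement is cosine similarity over unigram frequency distributions; this is an illustrative stand-in for the paper's analyses, and the corpora and helper names below are hypothetical.

```python
import math
from collections import Counter

def word_freq(text):
    """Relative word-frequency distribution of a text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def cosine_similarity(freq_a, freq_b):
    """Cosine similarity between two sparse frequency vectors."""
    shared = set(freq_a) & set(freq_b)
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy illustration of the homogenization finding: the two "model" texts
# resemble each other more than either resembles the "human" text.
human = word_freq("that winter my grandmother taught me to mend nets by the harbor")
model_a = word_freq("this experience taught me the value of perseverance and hard work")
model_b = word_freq("this journey taught me the value of dedication and hard work")
print(cosine_similarity(model_a, model_b) > cosine_similarity(model_a, human))  # True
```

Applied at corpus scale, a higher inter-model similarity than model-to-human similarity is exactly the homogenization pattern the paper reports.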
Jinsook Lee
Cornell University
Data Science in Education · Computational Social Science · AI Evaluation
A. Alvero
Information Science, Cornell University, Ithaca, 14853, New York, USA.
Thorsten Joachims
Professor of Computer Science, Cornell University
Machine Learning · Information Retrieval
René F. Kizilcec
Information Science, Cornell University, Ithaca, 14853, New York, USA.