🤖 AI Summary
This study investigates how AI-writing disclosure affects evaluations of textual quality and whether author race and gender moderate this effect. Method: We conduct a large-scale controlled experiment employing human raters (N=1,970) and large language model (LLM) raters (N=2,520). Contribution/Results: We document, for the first time, that absent disclosure, LLM raters exhibit a significant preference for texts attributed to women and Black authors; upon disclosure of AI assistance, however, this advantage vanishes entirely, constituting a "preference reversal." Human raters also penalize disclosure but show no identity-dependent reversal. These findings reveal that AI transparency policies may inadvertently exacerbate evaluative inequity: for marginalized authors, adherence to honesty norms erases a relative advantage. The study provides the first empirical evidence of implicit identity bias in LLM-based assessment and its attenuation under disclosure, offering critical insights for algorithmic fairness and AI ethics governance.
📝 Abstract
As AI becomes integrated into many forms of human writing, calls for transparency around AI assistance are growing. However, if transparency operates on uneven ground and certain identity groups bear a heavier cost for being honest, the burden of openness becomes asymmetrical. This study investigates how AI disclosure statements affect perceptions of writing quality, and whether these effects vary by the author's race and gender. In a large-scale controlled experiment, both human raters (n = 1,970) and LLM raters (n = 2,520) evaluated a single human-written news article while disclosure statements and author demographics were systematically varied. This design reflects how both human and algorithmic judgments now influence access to opportunities (e.g., hiring, promotion) and social recognition (e.g., content recommendation algorithms). We find that both human and LLM raters consistently penalize disclosed AI use. However, only LLM raters exhibit demographic interaction effects: they favor articles attributed to women or Black authors when no disclosure is present, but these advantages disappear when AI assistance is revealed. These findings illuminate the complex relationship between AI disclosure and author identity, highlighting disparities between machine and human evaluation patterns.