🤖 AI Summary
This study identifies systemic accent bias in AI speech synthesis technologies and introduces “digital accent exclusion”, a novel form of digital inequality. Focusing on two mainstream platforms, Speechify and ElevenLabs, we employ a mixed-methods approach that combines cross-accent speech quality evaluation, user surveys, in-depth interviews, and technical performance benchmarking to assess synthesis fidelity and intelligibility across five non-dominant English accents. Results show that accent-related performance degradation produces audible distortion, which contributes to users' sense of identity alienation and to service avoidance. Building on these findings, we propose a socio-technical redefinition of AI fairness that formally conceptualizes digital accent exclusion and advances actionable, inclusive pathways for algorithmic design, accent-diverse data curation, and regulatory policy. This work bridges technical evaluation with sociolinguistic equity, offering both empirical evidence and a normative framework for mitigating accent-based discrimination in voice AI systems.
📝 Abstract
Recent advances in artificial intelligence (AI) speech generation and voice cloning technologies have produced naturalistic speech and accurate voice replication, yet their influence on sociotechnical systems across diverse accents and linguistic traits is not fully understood. This study evaluates two synthetic AI voice services (Speechify and ElevenLabs) through a mixed-methods approach, using surveys and interviews to assess technical performance and to uncover how users' lived experiences influence their perceptions of accent variations in these speech technologies. Our findings reveal technical performance disparities across five regional English-language accents and demonstrate how current speech generation technologies may inadvertently reinforce linguistic privilege and accent-based discrimination, potentially creating new forms of digital exclusion. Overall, our study highlights the need for inclusive design and regulation, and provides actionable insights for developers, policymakers, and organizations to help ensure equitable and socially responsible AI speech technologies.