🤖 AI Summary
This study addresses the lack of empirical research on voice-based dark patterns in non-English languages, investigating deceptive manipulation of voice characteristics in Mandarin female synthetic speech. Method: A controlled experiment systematically varied five voice characteristics (e.g., intonation, timbre) at three intensities across shopping and question-answering scenarios; the manipulations were evaluated through user perception ratings and behavioral response measures. Contribution/Results: The effects of vocal manipulation were strongly context-dependent and mediated by perception. Specific voice configurations significantly increased users' behavioral compliance, by up to 2027.6%, with the size of the effect depending on the scenario, the voice characteristic, and individual perceptual differences. The work offers evidence-based insights for the ethical design of voice interfaces in Chinese-speaking contexts and extends dark pattern research in human–computer interaction beyond visual and English-centric paradigms.
📝 Abstract
Pervasive voice interaction enables deceptive patterns through subtle voice characteristics, yet empirical investigation of this kind of manipulation lags behind, especially in major non-English language contexts. To address this gap, we present the first systematic investigation of voice characteristic-based dark patterns that use female synthetic voices in Mandarin Chinese. This focus is important given the prevalence of female personas in commercial assistants and the significance of prosody in the Chinese language. Guided by a conceptual framework that identifies key influencing factors, we systematically evaluate how effectiveness varies by manipulating voice characteristics (five characteristics, three intensities) across scenarios (shopping vs. question-answering) with different commercial aims. A preliminary study (N=24) validated the experimental materials, and the main study (N=36) revealed significant behavioral manipulation (up to +2027.6%). Crucially, the analysis showed that effectiveness varied significantly with voice characteristic and scenario, and was mediated by user perception (of tone, intonation, and timbre) and by individual preferences, while demographic factors had limited impact. Together, these findings offer evidence-based insights for ethical design.