Accent Vector: Controllable Accent Manipulation for Multilingual TTS Without Accented Data

📅 2026-03-08
🤖 AI Summary
Current text-to-speech (TTS) systems struggle to model non-native accents, primarily due to the scarcity of accented speech data. This work proposes a novel approach that fine-tunes a multilingual TTS model on native non-English speech to extract task-specific accent vectors that capture distinctive accent characteristics. For the first time, this method enables fine-grained and composable cross-lingual accent control without requiring any accented training data. It supports continuous adjustment of accent strength and synthesis of mixed accents. Experimental results demonstrate that the proposed technique achieves precise and flexible manipulation of multilingual accents, as evidenced by both objective metrics and human evaluations.

📝 Abstract
Accent is an integral part of society, reflecting multiculturalism and shaping how individuals express identity. The majority of English speakers are non-native (L2) speakers, yet current text-to-speech (TTS) systems primarily model American-accented English due to limited accented data. We propose Accent Vector, a controllable representation that enables accent manipulation in multilingual TTS without requiring accented training data. Accent Vector is derived by fine-tuning a TTS system on native speech of a different language (i.e., non-English) and computing task vectors that capture accent characteristics (i.e., in English). By scaling and interpolating the vector, we achieve fine-grained control over accent strength and generate mixed-accent speech. In addition, the approach generalizes beyond English, enabling accent control across multiple languages. Objective and human evaluations confirm the effectiveness of Accent Vector for fine-grained and compositional accent control.
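
The scaling and interpolation described in the abstract can be illustrated with a minimal sketch. It assumes the usual task-arithmetic definition of a task vector (fine-tuned weights minus pretrained weights); the abstract does not spell out the exact formulation, so this is an illustration rather than the authors' implementation. The `Weights` type, the function names, the parameter name `layer.weight`, and all numbers are hypothetical stand-ins for a real TTS checkpoint.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def task_vector(pretrained: Weights, finetuned: Weights) -> Weights:
    """Per-parameter difference: fine-tuned weights minus pretrained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def combine(vectors: List[Weights], coeffs: List[float]) -> Weights:
    """Weighted sum of several accent vectors (used here to mix accents)."""
    first = vectors[0]
    return {
        name: [
            sum(c * v[name][i] for v, c in zip(vectors, coeffs))
            for i in range(len(first[name]))
        ]
        for name in first
    }


def apply_vector(pretrained: Weights, vector: Weights, alpha: float = 1.0) -> Weights:
    """Add an accent vector, scaled by alpha, to the pretrained weights."""
    return {
        name: [p + alpha * d for p, d in zip(pretrained[name], vector[name])]
        for name in pretrained
    }


# Toy numbers only: a "pretrained" model and two models fine-tuned on
# native speech of two hypothetical non-English languages A and B.
base = {"layer.weight": [0.0, 1.0, 2.0]}
ft_a = {"layer.weight": [1.0, 1.0, 4.0]}
ft_b = {"layer.weight": [0.0, 3.0, 2.0]}

vec_a = task_vector(base, ft_a)
vec_b = task_vector(base, ft_b)

# Accent strength: alpha = 0.5 applies half of accent A.
half_a = apply_vector(base, vec_a, alpha=0.5)

# Mixed accent: an equal blend of accents A and B.
mixed = apply_vector(base, combine([vec_a, vec_b], [0.5, 0.5]))
```

In this sketch, `alpha` plays the role of accent strength and the coefficients passed to `combine` set the mixing ratio between accents. With real checkpoints the same arithmetic would run over tensors rather than Python lists.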
Problem

Research questions and friction points this paper is trying to address.

accent manipulation
multilingual TTS
accented speech
non-native speakers
controllable accent
Innovation

Methods, ideas, or system contributions that make the work stand out.

Accent Vector
controllable accent manipulation
multilingual TTS
task vector
accent generalization
Authors
Thanathai Lertpetchpun
Signal Analysis and Interpretation Lab, University of Southern California, USA
Thanapat Trachu
Thomas Lord Department of Computer Science, University of Southern California, USA
Jihwan Lee
PhD Student, Signal Analysis and Interpretation Lab (SAIL) at University of Southern California
brain-computer interfaces, speech synthesis, biosignal-to-speech, articulatory phonetics
Tiantian Feng
Postdoc Researcher
Health and Behaviors, Wearable Computing, Affective Computing, Speech and Biosignal, Responsible ML
Dani Byrd
Department of Linguistics, University of Southern California
Shrikanth Narayanan
Signal Analysis and Interpretation Lab, University of Southern California, USA; Thomas Lord Department of Computer Science, University of Southern California, USA; Department of Linguistics, University of Southern California