Learning-free L2-Accented Speech Generation using Phonological Rules

📅 2026-03-08
🤖 AI Summary
This work proposes a phoneme-level accented speech synthesis method that requires no accented training data, addressing the limitations of existing text-to-speech (TTS) systems which either rely on large-scale accented corpora or lack fine-grained phonemic control. By applying interpretable phonological rules to the phoneme sequences of a multilingual TTS model, the approach enables explicit modeling and manipulation of second-language accents. To the best of our knowledge, this is the first demonstration of zero-shot, phoneme-level accent transfer without additional learning. Experiments generating Spanish- and Indian-accented English show that the method effectively preserves speech intelligibility while maintaining high audio quality, thereby validating the efficacy and controllability of phonological rules in cross-lingual accent modeling.

📝 Abstract
Accent plays a crucial role in speaker identity and inclusivity in speech technologies. Existing accented text-to-speech (TTS) systems either require large-scale accented datasets or lack fine-grained phoneme-level controllability. We propose an accented TTS framework that combines phonological rules with a multilingual TTS model. The rules are applied to phoneme sequences to transform accent at the phoneme level while preserving intelligibility. The method requires no accented training data and enables explicit phoneme-level accent manipulation. We design rule sets for Spanish- and Indian-accented English, modeling systematic differences in consonants, vowels, and syllable structure arising from phonotactic constraints. We analyze the trade-off between phoneme-level duration alignment and accent as realized in speech timing. Experimental results demonstrate effective accent shift while maintaining speech quality.
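
To make the idea of rule-based phoneme rewriting concrete, here is a minimal Python sketch. It is not the authors' rule set or implementation: the substitution table, the epenthesis rule, the ARPAbet-style symbols, and the function names (`substitute`, `epenthesize`, `accent_word`) are all illustrative assumptions, chosen from commonly described Spanish-influenced patterns in English. It only shows how consonant, vowel, and syllable-structure rules could be applied to a phoneme sequence before it is passed to a TTS model.

```python
from typing import Dict, List

# Hypothetical context-free substitutions (ARPAbet-style symbols).
# These are illustrative assumptions, not the rules used in the paper.
SUBSTITUTIONS: Dict[str, str] = {
    "Z": "S",    # consonant rule: /z/ realized as /s/
    "V": "B",    # consonant rule: /v/ realized as /b/
    "IH": "IY",  # vowel rule: lax /ih/ merged with tense /iy/
}

# Voiceless stops that can follow word-initial /s/ in English clusters.
STOPS = {"P", "T", "K"}


def substitute(phonemes: List[str], rules: Dict[str, str]) -> List[str]:
    """Replace each phoneme that has a rule; leave the others unchanged."""
    return [rules.get(p, p) for p in phonemes]


def epenthesize(phonemes: List[str]) -> List[str]:
    """Syllable-structure rule (assumed): insert a vowel before a
    word-initial /s/ + stop cluster, approximated here with 'EH'."""
    if len(phonemes) >= 2 and phonemes[0] == "S" and phonemes[1] in STOPS:
        return ["EH"] + phonemes
    return phonemes


def accent_word(phonemes: List[str]) -> List[str]:
    """Apply substitutions first, then the epenthesis rule, to one word."""
    return epenthesize(substitute(phonemes, SUBSTITUTIONS))


if __name__ == "__main__":
    print(accent_word(["S", "P", "IY", "K"]))   # "speak"
    print(accent_word(["Z", "IH", "P"]))        # "zip"
    print(accent_word(["V", "EH", "R", "IY"]))  # "very"
```

Because each rule is a plain, inspectable function over the phoneme sequence, individual rules can be switched on or off, which is one way to read the "explicit phoneme-level accent manipulation" described above. Inserting a phoneme also changes the sequence length, which hints at why duration alignment becomes a consideration.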
Problem

Research questions and friction points this paper is trying to address.

accented speech synthesis
phoneme-level control
text-to-speech
speaker identity
speech inclusivity
Innovation

Methods, ideas, or system contributions that make the work stand out.

phonological rules
accented TTS
learning-free
phoneme-level control
L2-accent generation
Thanathai Lertpetchpun
Signal Analysis and Interpretation Lab, University of Southern California, USA
Yoonjeong Lee
Signal Analysis and Interpretation Lab, University of Southern California, USA
Jihwan Lee
PhD Student, Signal Analysis and Interpretation Lab (SAIL) at University of Southern California
brain-computer interfaces, speech synthesis, biosignal-to-speech, articulatory phonetics
Tiantian Feng
Postdoc Researcher
Health and Behaviors, Wearable Computing, Affective Computing, Speech and Biosignal, Responsible ML
Dani Byrd
Department of Linguistics, University of Southern California
Shrikanth Narayanan
Signal Analysis and Interpretation Lab, University of Southern California, USA; Department of Linguistics, University of Southern California