Learning-free L2-Accented Speech Generation using Phonological Rules

📅 2026-03-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes a phoneme-level accented speech synthesis method that requires no accented training data, addressing the limitations of existing text-to-speech (TTS) systems, which either rely on large-scale accented corpora or lack fine-grained phonemic control. By applying interpretable phonological rules to the phoneme sequences of a multilingual TTS model, the approach enables explicit modeling and manipulation of second-language accents. The summary presents this as, to the best of the authors' knowledge, the first demonstration of zero-shot, phoneme-level accent transfer without additional learning. Experiments generating Spanish- and Indian-accented English show that the method preserves speech intelligibility while maintaining high audio quality, supporting the efficacy and controllability of phonological rules in cross-lingual accent modeling.

📝 Abstract

Accent plays a crucial role in speaker identity and inclusivity in speech technologies. Existing accented text-to-speech (TTS) systems either require large-scale accented datasets or lack fine-grained phoneme-level controllability. We propose an accented TTS framework that combines phonological rules with a multilingual TTS model. The rules are applied to phoneme sequences to transform accent at the phoneme level while preserving intelligibility. The method requires no accented training data and enables explicit phoneme-level accent manipulation. We design rule sets for Spanish- and Indian-accented English, modeling systematic differences in consonants, vowels, and syllable structure arising from phonotactic constraints. We analyze the trade-off between phoneme-level duration alignment and accent as realized in speech timing. Experimental results demonstrate effective accent shift while maintaining speech quality.
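
To make the idea of rule-based phoneme rewriting concrete, the following is a minimal sketch of how rules might be applied to a phoneme sequence before it is passed to a TTS model. The rules shown (/z/ to /s/, /v/ to /b/, and an epenthetic /e/ before a word-initial /s/ + consonant cluster) are commonly described transfer patterns for Spanish-accented English and are chosen only for illustration; they are assumptions, not the paper's actual rule sets. The names `SUBSTITUTIONS`, `CLUSTER_CONSONANTS`, and `accent_word` are likewise hypothetical.

```python
from typing import Dict, List, Set

# Illustrative segment substitutions (assumed, not taken from the paper).
SUBSTITUTIONS: Dict[str, str] = {"z": "s", "v": "b"}

# Consonants that can follow word-initial /s/ in an English onset cluster.
CLUSTER_CONSONANTS: Set[str] = {"p", "t", "k", "m", "n", "l", "w"}


def apply_substitutions(phonemes: List[str]) -> List[str]:
    """Replace each phoneme that has a substitution rule; keep the rest."""
    return [SUBSTITUTIONS.get(p, p) for p in phonemes]


def apply_epenthesis(phonemes: List[str]) -> List[str]:
    """Insert /e/ before a word-initial /s/ + consonant cluster."""
    if (
        len(phonemes) >= 2
        and phonemes[0] == "s"
        and phonemes[1] in CLUSTER_CONSONANTS
    ):
        return ["e"] + phonemes
    return list(phonemes)


def accent_word(phonemes: List[str]) -> List[str]:
    """Apply substitutions first, then the syllable-structure rule."""
    return apply_epenthesis(apply_substitutions(phonemes))


if __name__ == "__main__":
    print(accent_word(["s", "t", "ɑ", "p"]))  # "stop"
    print(accent_word(["v", "ɛ", "r", "i"]))  # "very"
```

Because each rule is a small, separate function, individual rules can be switched on or off, which is one way to picture the explicit phoneme-level control the abstract describes.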

Problem

Research questions and friction points this paper is trying to address.

accented speech synthesis

phoneme-level control

text-to-speech

speaker identity

speech inclusivity

Innovation

Methods, ideas, or system contributions that make the work stand out.

phonological rules

accented TTS

learning-free