🤖 AI Summary
This study investigates whether articulatory strategy in vowel production, specifically tongue shape, is sufficiently speaker-specific to support speaker discrimination. Using midsagittal ultrasound tongue contour data from 40 English speakers from the North West of England, tongue shapes were aligned via Generalised Procrustes Analysis, and orthogonal size and shape features were extracted. Discriminatory performance was evaluated within a likelihood-ratio framework. Three principal findings emerge: (1) tongue size is the single most discriminative dimension; (2) shape variation in the anterior part of the tongue generally outperforms shape variation in the posterior part; (3) in combination, shape-only features may offer speaker specificity comparable to size-and-shape features, but only when the features do not co-vary at the speaker level. The results suggest that tongue size and shape carry speaker-specific information that could serve as a basis for speaker discrimination.
📝 Abstract
The way speakers articulate is well known to be variable across individuals while at the same time subject to anatomical and biomechanical constraints. In this study, we ask whether articulatory strategy in vowel production can be sufficiently speaker-specific to form the basis for speaker discrimination. We conducted Generalised Procrustes Analyses of tongue shape data from 40 English speakers from the North West of England, and assessed the speaker-discriminatory potential of orthogonal tongue shape features within the framework of likelihood ratios. Tongue size emerged as the individual dimension with the strongest discriminatory power, while tongue shape variation in the more anterior part of the tongue generally outperformed tongue shape variation in the posterior part. When considered in combination, shape-only information may offer comparable levels of speaker specificity to size-and-shape information, but only when features do not exhibit speaker-level co-variation.
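To make concrete how a Procrustes analysis separates tongue size from tongue shape, here is a minimal sketch in Python with NumPy. It is not the authors' pipeline, which the abstract does not specify. It assumes each tongue contour is an array of 2D points with the same number of corresponding points per contour, and it uses centroid size as the size measure. The function names (`normalise`, `rotate_onto`, `generalised_procrustes`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def normalise(contour):
    """Remove translation and scale; return (unit-size shape, centroid size)."""
    centred = contour - contour.mean(axis=0)
    # Centroid size: root of the summed squared distances from the centroid.
    size = np.sqrt((centred ** 2).sum())
    return centred / size, float(size)


def rotate_onto(shape, reference):
    """Rotate `shape` to best match `reference` (least squares, no reflection)."""
    u, _, vt = np.linalg.svd(shape.T @ reference)
    # Flip one axis if the optimal orthogonal matrix would be a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    return shape @ (u @ np.diag([1.0, d]) @ vt)


def generalised_procrustes(contours, n_iter=10):
    """Iteratively align contours to their evolving mean shape.

    Returns the aligned unit-size shapes, the centroid sizes that were
    factored out, and the final mean shape.
    """
    normalised = [normalise(np.asarray(c, dtype=float)) for c in contours]
    shapes = np.stack([s for s, _ in normalised])
    sizes = np.array([z for _, z in normalised])
    mean = shapes[0]  # assumption: the first contour seeds the reference
    for _ in range(n_iter):
        shapes = np.stack([rotate_onto(s, mean) for s in shapes])
        new_mean = shapes.mean(axis=0)
        new_mean = new_mean / np.sqrt((new_mean ** 2).sum())
        if np.allclose(new_mean, mean, atol=1e-10):
            break
        mean = new_mean
    return shapes, sizes, mean
```

In this sketch, `sizes` holds one size value per contour, while the aligned `shapes` contain only the variation that remains after position, scale, and rotation are removed. Orthogonal shape features could then be derived from the aligned shapes, for example by principal component analysis, before any likelihood-ratio scoring; that step is likewise an assumption here and not a description of the study's method.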