What makes a word hard to learn? Modeling L1 influence on English vocabulary difficulty

📅 2026-05-12
🤖 AI Summary
This study examines how a learner's native language affects the difficulty of English vocabulary, focusing on speakers of Spanish, German, and Chinese. The authors train gradient-boosted models on features covering word familiarity (e.g., frequency), meaning, surface form, and cross-linguistic transfer, and use Shapley values to measure the importance of each feature group. Word familiarity is the dominant feature group for all three languages. Predictions for Spanish- and German-speaking learners additionally rely on orthographic transfer, a mechanism unavailable to Chinese learners, whose difficulty is shaped by familiarity and surface features alone. The resulting models yield interpretable, L1-tailored difficulty estimates that can inform vocabulary curriculum design.
📝 Abstract
What makes a word difficult to learn, and how does the difficulty depend on the learner's native language? We computationally model vocabulary difficulty for English learners whose first language is Spanish, German, or Chinese with gradient-boosted models trained on features related to a word's familiarity (e.g., frequency), meaning, surface form, and cross-linguistic transfer. Using Shapley values, we determine the importance of each feature group. Word familiarity is the dominant feature group shared by all three languages. However, predictions for Spanish- and German-speaking learners rely additionally on orthographic transfer. This transfer mechanism is unavailable to Chinese learners, whose difficulty is shaped by a combination of familiarity and surface features alone. Our models provide interpretable, L1-tailored difficulty estimates that can be used to design vocabulary curricula.
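To make the group-level attribution concrete, the following is a minimal sketch of exact Shapley values computed over feature groups. It is not the authors' implementation: the value function `toy_value`, its weights, and the interaction term are invented for illustration, and a real analysis would replace it with the output of a trained gradient-boosted model. Only the four group names are taken from the abstract.

```python
from itertools import combinations
from math import factorial

# Feature groups named in the abstract; treating each group as one "player"
# is an assumption of this sketch.
GROUPS = ["familiarity", "meaning", "surface", "transfer"]


def toy_value(active):
    """Hypothetical model output when only the groups in `active` are used.

    All weights are made up. The last term is an invented interaction
    between familiarity and transfer.
    """
    score = 0.0
    if "familiarity" in active:
        score += 0.40
    if "meaning" in active:
        score += 0.05
    if "surface" in active:
        score += 0.10
    if "transfer" in active:
        score += 0.20
    if "familiarity" in active and "transfer" in active:
        score += 0.10
    return score


def shapley_values(players, value):
    """Exact Shapley values: weighted average marginal contribution of each
    player over all subsets of the remaining players."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


phi = shapley_values(GROUPS, toy_value)
for group, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{group:12s} {contribution:.3f}")
```

In this toy setup the interaction term is split equally between the two groups involved, so familiarity receives 0.45 and transfer 0.25, and the four values sum to the difference between the full model and the empty one. Exact enumeration is feasible here only because there are four groups; with many individual features, an approximation or a tree-specific algorithm would be needed.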
Problem

Research questions and friction points this paper is trying to address.

vocabulary difficulty
L1 influence
cross-linguistic transfer
second language acquisition
word learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-linguistic transfer
vocabulary difficulty modeling
L1 influence
gradient-boosted models
Shapley values