Dispersion Measures as Predictors of Lexical Decision Time, Word Familiarity, and Lexical Complexity

📅 2025-01-11
🤖 AI Summary
This study evaluates how well word dispersion measures predict lexical decision time, word familiarity, and lexical complexity across languages. Using corpora from five languages, it applies multivariate linear regression and granularity-controlled experiments to compare log-range, log-frequency, and established dispersion measures (e.g., DP, VMR). Log-range predicts all three outcomes better than log-frequency in every language and task, and it is also the strongest additional predictor alongside log-frequency, consistently outperforming the more complex dispersion measures. The results suggest that a simple, interpretable measure can have greater psycholinguistic validity than more elaborate ones. The analysis of corpus part granularity and logarithmic transformation helps explain why previous studies reported contradictory dispersion effects. This makes log-range a practical predictor for empirical research on lexical representation and for computational modeling of lexical access.

📝 Abstract
Various measures of dispersion have been proposed to paint a fuller picture of a word's distribution in a corpus, but little has been done to validate them externally. We evaluate a wide range of dispersion measures as predictors of lexical decision time, word familiarity, and lexical complexity in five diverse languages. We find that the logarithm of range is not only a better predictor than log-frequency across all tasks and languages, but that it is also the most powerful additional variable to log-frequency, consistently outperforming the more complex dispersion measures. We discuss the effects of corpus part granularity and logarithmic transformation, shedding light on contradictory results of previous studies.
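
To make the quantities in the abstract concrete, here is a minimal Python sketch of how frequency, range, and their logarithms can be computed from a corpus split into parts. It is an illustration, not the paper's pipeline: the function name `dispersion_stats`, the toy corpus, and the use of the natural logarithm are assumptions. For contrast it also computes DP in the sense of Gries's deviation of proportions (half the summed absolute difference between each part's observed share of the word and its share of the corpus); whether the paper uses exactly this formulation is likewise an assumption.

```python
import math
from collections import Counter


def dispersion_stats(parts, word):
    """Return frequency, range, their logs, and DP for `word`.

    `parts` is a list of corpus parts, each a list of tokens.
    (Hypothetical helper for illustration only.)
    """
    counts = [Counter(part)[word] for part in parts]  # occurrences per part
    sizes = [len(part) for part in parts]             # tokens per part
    freq = sum(counts)
    if freq == 0:
        raise ValueError(f"{word!r} does not occur in the corpus")
    total = sum(sizes)

    # Range: number of corpus parts that contain the word at least once.
    word_range = sum(1 for c in counts if c > 0)

    # DP (assumed formulation): 0.5 * sum |observed share - expected share|.
    dp = 0.5 * sum(abs(c / freq - s / total) for c, s in zip(counts, sizes))

    return {
        "frequency": freq,
        "range": word_range,
        "log_frequency": math.log(freq),   # natural log is an assumption
        "log_range": math.log(word_range),
        "dp": dp,
    }


# Toy corpus with three parts (made up for this example).
toy_parts = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran", "the"],
    ["a", "bird", "flew"],
]
stats = dispersion_stats(toy_parts, "the")
print(stats["frequency"], stats["range"], round(stats["dp"], 3))  # 3 2 0.3
```

Note that range depends on how the corpus is divided: splitting the same text into more, smaller parts changes the count of parts containing a word, which is one reason corpus part granularity matters when comparing results across studies.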
Problem

Research questions and friction points this paper is trying to address.

Lexical Processing Time
Lexical Complexity
Cross-linguistic Analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Log Range
Predictive Accuracy
Corpus Part Granularity