🤖 AI Summary
This work addresses the limited robustness of existing lexical semantic change detection methods, which typically rely on metrics such as APD and PRT and are unstable across models and representation spaces. The authors propose a new metric, Average Minimum Distance (AMD), and its symmetric variant, SAMD, which measure change through local correspondences between word usages rather than global pairwise or prototype comparisons. The approach builds on contextualised language model embeddings and is evaluated systematically across dimensionality reduction techniques, specialised and non-specialised encoders, and multilingual corpora. Experiments show that AMD outperforms baseline metrics, particularly with non-specialised encoders and under dimensionality reduction, while SAMD is markedly stronger with specialised encoders, confirming the robustness of the proposed metrics across diverse multilingual and architectural settings.
📝 Abstract
Lexical semantic change detection (LSCD) increasingly relies on contextualised language model embeddings, yet most approaches still quantify change using a small set of semantic change metrics, primarily Average Pairwise Distance (APD) and cosine distance over word prototypes (PRT). We introduce Average Minimum Distance (AMD) and Symmetric Average Minimum Distance (SAMD), new measures that quantify semantic change via local correspondence between word usages across time periods. Across multiple languages, encoder models, and representation spaces, we show that AMD often provides more robust performance, particularly under dimensionality reduction and with non-specialised encoders, while SAMD excels with specialised encoders. We suggest that LSCD may benefit from considering alternative semantic change metrics beyond APD and PRT, with AMD offering a robust option for contextualised embedding-based analysis.
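To make the metric description concrete, here is a minimal sketch of how AMD and SAMD could be computed over two sets of contextualised usage embeddings, one per time period. The cosine distance, the nearest-neighbour minimum, and the symmetrisation by averaging both directions are assumptions inferred from the abstract, not the paper's exact definitions.

```python
import numpy as np

def cosine_distances(X, Y):
    """Pairwise cosine distances between rows of X (n, d) and Y (m, d)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return 1.0 - Xn @ Yn.T  # shape (n, m)

def amd(X, Y):
    """Average Minimum Distance from X to Y (assumed formulation).

    For each usage embedding in X (period 1), take the distance to its
    nearest neighbour in Y (period 2), then average over X.
    """
    D = cosine_distances(X, Y)
    return D.min(axis=1).mean()

def samd(X, Y):
    """Symmetric AMD (assumption: average of the two directional scores,
    analogous to a Chamfer distance)."""
    return 0.5 * (amd(X, Y) + amd(Y, X))

# Toy usage with random stand-ins for contextualised embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 768))   # usages of a target word in period 1
Y = rng.normal(size=(55, 768))   # usages of the same word in period 2
print(f"AMD:  {amd(X, Y):.4f}")
print(f"SAMD: {samd(X, Y):.4f}")
```

Intuitively, under this formulation AMD asks how far each usage in one period is from its closest counterpart in the other: a word whose usages all have near neighbours across periods scores low (stable meaning), while a cluster of novel usages with no close counterparts raises the score.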