LISTN: Lexicon induction with socio-temporal nuance

📅 2024-09-28

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing lexicon induction methods do not model the coupled evolution of language and social structure within online subcultures (e.g., anti-women communities), which hinders the generation of socially and temporally grounded in-group lexicons. To address this, we propose the first unified framework that jointly models dynamic word embeddings and user embeddings, enabling quantitative characterization of the association between a term, a sub-community, and a point in time. Our approach integrates a dynamic graph neural network with temporal contrastive learning to co-optimize user–word representations. We construct the first linguist-validated, temporally annotated lexicon of the “men’s rights” ecosystem (over one thousand evolving terms with time-varying weights) and a corresponding benchmark evaluation set. Experiments demonstrate significant improvements over state-of-the-art baselines across multiple sub-communities. Furthermore, our analysis uncovers causal patterns linking intra-group linguistic diffusion to identity polarization.

📝 Abstract

In-group language is an important signifier of group dynamics. This paper proposes a novel method for inducing lexicons of in-group language, which incorporates its socio-temporal context. Existing methods for lexicon induction do not capture the evolving nature of in-group language, nor the social structure of the community. Using dynamic word and user embeddings trained on conversations from online anti-women communities, our approach outperforms prior methods for lexicon induction. We develop a test set for the task of lexicon induction and a new lexicon of manosphere language, validated by human experts, which quantifies the relevance of each term to a specific sub-community at a given point in time. Finally, we present novel insights on in-group language which illustrate the utility of this approach.
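To make the idea of a lexicon that "quantifies the relevance of each term to a specific sub-community at a given point in time" more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that word and user embeddings for each time slice live in a shared vector space, that a sub-community can be represented by the centroid of its members' user embeddings, and that relevance can be approximated by cosine similarity. All names (`term_relevance`, `group_x`, `term_a`, and so on) and all vectors are hypothetical toy values.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def term_relevance(word_vecs, user_vecs, communities, t):
    """Score every term against every sub-community at time slice t.

    Assumption (not from the paper): relevance is the cosine similarity
    between the term's embedding at t and the centroid of the
    sub-community's user embeddings at t.
    Returns {community_name: {term: score}}.
    """
    scores = {}
    for name, members in communities.items():
        center = centroid([user_vecs[t][u] for u in members])
        scores[name] = {
            term: cosine(vec, center) for term, vec in word_vecs[t].items()
        }
    return scores


# Hypothetical toy data: two time slices, two terms, three users.
word_vecs = {
    0: {"term_a": [1.0, 0.0], "term_b": [0.0, 1.0]},
    1: {"term_a": [0.0, 1.0], "term_b": [1.0, 0.0]},  # usage drifts over time
}
user_vecs = {
    0: {"u1": [1.0, 0.0], "u2": [1.0, 0.0], "u3": [0.0, 1.0]},
    1: {"u1": [1.0, 0.0], "u2": [1.0, 0.0], "u3": [0.0, 1.0]},
}
communities = {"group_x": ["u1", "u2"], "group_y": ["u3"]}

scores_t0 = term_relevance(word_vecs, user_vecs, communities, t=0)
scores_t1 = term_relevance(word_vecs, user_vecs, communities, t=1)
```

In this toy setup, `term_a` is maximally relevant to `group_x` at the first time slice and shifts to `group_y` at the second, which is the kind of time-varying, community-specific weight the abstract describes. A real system would learn the embeddings from conversation data rather than fixing them by hand.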

Problem

Research questions and friction points this paper is trying to address.

Inducing lexicons of evolving in-group language

Incorporating socio-temporal context in lexicon induction

Capturing community social structure in language analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic word and user embeddings

Socio-temporal context incorporation

Human-validated lexicon quantification

🔎 Similar Papers

To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models