What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models

📅 2024-07-02
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing studies on scientific term evolution overlook implicit semantic reconstruction—where nominal identity persists while conceptual content shifts, as exemplified by the “Ship of Theseus” paradox. Method: We introduce the novel concept of the “Language Model Ship” and construct a specialized corpus from ten years of top-tier NLP conference papers. Integrating quantitative text analysis, term co-occurrence modeling, and diachronic semantic tracking, we systematically trace how “language model” evolves. Contribution/Results: Empirical analysis reveals three distinct referential shifts over the decade—RNN-based → Transformer-based → LLM-based—tightly coupled with dominant architectural advances. This demonstrates that semantic drift is not noise but a core mechanism of scientific progress, wherein theoretical conceptualization and system implementation co-constitute discourse evolution. The study establishes implicit semantic reconstruction as a fundamental driver of scientific language change.
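The diachronic tracking described above can be sketched in miniature: count, per year, which architecture terms co-occur with "language model" and see which one dominates. This is a toy illustration of the idea only, not the paper's actual pipeline; the corpus, marker list, and function names below are all made up for demonstration.

```python
from collections import Counter, defaultdict

# Hypothetical (year, abstract) pairs standing in for the paper's
# ten-year NLP-conference corpus; data is invented for illustration.
corpus = [
    (2015, "we train an rnn language model with lstm units"),
    (2017, "a transformer language model with self-attention"),
    (2020, "the transformer language model is pretrained at scale"),
    (2023, "we prompt a large language model llm for zero-shot tasks"),
    (2023, "llm agents use a language model as a planner"),
]

# Architecture markers whose co-occurrence with the anchor term we track.
markers = ["rnn", "transformer", "llm"]

def cooccurrence_by_year(corpus, markers, anchor="language model"):
    """Count, per year, how often each marker token appears in texts
    that also mention the anchor term."""
    counts = defaultdict(Counter)
    for year, text in corpus:
        if anchor in text:
            tokens = set(text.split())
            for m in markers:
                if m in tokens:
                    counts[year][m] += 1
    return counts

counts = cooccurrence_by_year(corpus, markers)
for year in sorted(counts):
    marker, n = counts[year].most_common(1)[0]
    print(year, dict(counts[year]), "-> dominant:", marker)
```

On this toy data the dominant co-occurring marker shifts from "rnn" (2015) through "transformer" (2017–2020) to "llm" (2023), mirroring the three referential shifts the paper reports.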

📝 Abstract
The term Language Models (LMs) as a time-specific collection of models of interest is constantly reinvented, with its referents updated much like the "Ship of Theseus" replaces its parts but remains the same ship in essence. In this paper, we investigate this "Ship of Language Models" problem, wherein scientific evolution takes the form of continuous, implicit retrofits of key existing terms. We seek to initiate a novel perspective of scientific progress, in addition to the more well-studied emergence of new terms. To this end, we construct the data infrastructure based on recent NLP publications. Then, we perform a series of text-based analyses toward a detailed, quantitative understanding of the use of Language Models as a term of art. Our work highlights how systems and theories influence each other in scientific discourse, and we call for attention to the transformation of this Ship that we all are contributing to.
Problem

Research questions and friction points this paper is trying to address.

Analyze evolution of Language Models
Study implicit term retrofits
Quantify term usage in NLP
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzes Language Models evolution
Constructs NLP data infrastructure
Performs text-based quantitative analyses
Shengqi Zhu
School of Information Science, Cornell University
Jeffrey M. Rzeszotarski
Cornell University