🤖 AI Summary
This work investigates how language models acquire and represent linguistic compositionality. Method: We propose intrinsic dimensionality (ID) as a geometric proxy to quantify the degree of compositionality in datasets and its evolution within model representation spaces. Using ID estimation, representational geometry analysis, training trajectory tracking, and linear/nonlinear subspace decomposition, we systematically analyze how compositionality manifests geometrically during training. Contribution/Results: We establish, for the first time, that ID reliably measures linguistic compositionality: ID decreases systematically with increasing compositionality; semantic composition is encoded in nonlinear subspaces, while surface-level syntactic structure resides in linear subspaces; ID dynamics reflect compositionality-driven feature learning phases and strongly correlate with the abstractness of learned linguistic rules. These findings uncover the geometric mechanisms underlying the dynamic emergence of compositionality during training and introduce a novel paradigm for understanding neural models’ abstraction capabilities.
📝 Abstract
By virtue of linguistic compositionality, a few syntactic rules and a finite lexicon can generate an unbounded number of sentences. That is, language, though seemingly high-dimensional, can be explained using relatively few degrees of freedom. An open question is whether contemporary language models (LMs) reflect the intrinsic simplicity of language that is enabled by compositionality. We take a geometric view of this problem by relating the degree of compositionality in a dataset to the intrinsic dimension (ID) of its representations under an LM, a measure of feature complexity. We find not only that the degree of dataset compositionality is reflected in representations' ID, but also that the relationship between compositionality and geometric complexity arises from linguistic features learned over training. Finally, our analyses reveal a striking contrast between nonlinear and linear dimensionality, showing that they respectively encode semantic and superficial aspects of linguistic composition.
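The ID of a representation space can be computed with standard estimators. As an illustrative sketch (the TwoNN estimator of Facco et al. is one common choice; the abstract does not specify which estimator the paper uses), the idea is to look at the ratio of each point's second- to first-nearest-neighbor distance, which depends on the manifold's dimension but not on its embedding dimension:

```python
import numpy as np

def twonn_id(X: np.ndarray) -> float:
    """TwoNN intrinsic-dimension estimate (Facco et al. 2017).

    For each point, take the ratio mu = r2 / r1 of the distances to its
    second and first nearest neighbors; the maximum-likelihood estimate
    of the ID is then N / sum(log mu).
    """
    # Pairwise Euclidean distances (fine for small N; use a KD-tree for large N).
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # exclude self-distance
    sorted_d = np.sort(dists, axis=1)
    r1, r2 = sorted_d[:, 0], sorted_d[:, 1]
    mu = r2 / r1
    return len(X) / np.log(mu).sum()

# Sanity check: a 2-D manifold linearly embedded in 10-D ambient space
# should yield an ID estimate near 2, not 10.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))               # latent 2-D data
X = Z @ rng.normal(size=(2, 10))            # embed into 10 dimensions
print(round(twonn_id(X), 1))
```

In the paper's setting, `X` would be a matrix of LM hidden states for a dataset, so a lower estimate indicates that the representations occupy a simpler, lower-dimensional manifold, which is the sense in which ID serves as a proxy for compositionality.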