🤖 AI Summary
Existing intrinsic probes yield loose estimates of the mutual information between a linguistic attribute and a representation, limiting precise analysis of where linguistic knowledge (e.g., morphology, syntax) is encoded in pretrained contextualized representations.
Method: We propose a novel latent-variable formulation for constructing intrinsic probes and derive a computationally tractable variational approximation to the log-likelihood, enabling tighter mutual information estimates.
Contribution/Results: The model is versatile and yields tighter mutual information estimates than two intrinsic probes previously proposed in the literature. Beyond identifying whether a representation encodes a linguistic attribute, it pinpoints where the attribute is encoded. Applied to multilingual pretrained models, it provides empirical evidence that their representations develop a cross-lingually entangled notion of morphosyntax.
📝 Abstract
The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information.
Indeed, it is natural to assume that these pre-trained representations do encode some level of linguistic knowledge as they have brought about large empirical improvements on a wide variety of NLP tasks, which suggests they are learning true linguistic generalization.
In this work, we focus on intrinsic probing, an analysis technique where the goal is not only to identify whether a representation encodes a linguistic attribute but also to pinpoint where this attribute is encoded.
We propose a novel latent-variable formulation for constructing intrinsic probes and derive a tractable variational approximation to the log-likelihood.
Our results show that our model is versatile and yields tighter mutual information estimates than two intrinsic probes previously proposed in the literature.
Finally, we find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
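The abstract's core recipe, estimating the mutual information between a linguistic attribute and a chosen subset of representation dimensions via a tractable variational bound, can be sketched as follows. This is not the paper's actual model (which uses a latent-variable formulation over dimension subsets); it is a simplified illustration of the general principle that a probe's held-out cross-entropy upper-bounds the conditional entropy H(A | h), so subtracting it from H(A) gives a lower bound on I(A; h). All data, dimension indices, and function names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 50-dim "representations" in which only dims {3, 17} carry
# information about a binary attribute (e.g., grammatical number).
n, d = 2000, 50
labels = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, 3] += 2.0 * labels   # informative dimension (hypothetical)
X[:, 17] -= 1.5 * labels  # informative dimension (hypothetical)

def mi_lower_bound(X, y, dims, n_train=1500):
    """Variational lower bound (in nats) on I(A; h_dims).

    Fits a class-conditional diagonal Gaussian q(h_dims | a) on a training
    split; the held-out cross-entropy of the induced posterior q(a | h_dims)
    upper-bounds H(A | h_dims), so H(A) minus it lower-bounds the MI.
    """
    Xs = X[:, dims]
    Xtr, ytr, Xte, yte = Xs[:n_train], y[:n_train], Xs[n_train:], y[n_train:]
    classes = np.unique(ytr)
    log_post = np.zeros((len(Xte), len(classes)))
    for i, c in enumerate(classes):
        Z = Xtr[ytr == c]
        mu, var = Z.mean(0), Z.var(0) + 1e-6
        log_lik = -0.5 * (np.log(2 * np.pi * var) + (Xte - mu) ** 2 / var).sum(1)
        log_post[:, i] = log_lik + np.log((ytr == c).mean())
    # Normalize to log q(a | h_dims) and take held-out cross-entropy.
    log_post -= np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    h_cond = -log_post[np.arange(len(yte)), np.searchsorted(classes, yte)].mean()
    p = np.bincount(y) / len(y)
    h_marg = -(p[p > 0] * np.log(p[p > 0])).sum()
    return h_marg - h_cond

# Intrinsic probing: compare candidate dimension subsets to localize
# where the attribute is encoded.
informative = mi_lower_bound(X, labels, [3, 17])
uninformative = mi_lower_bound(X, labels, [0, 1])
```

Comparing the bound across dimension subsets is what makes the probe "intrinsic": the subset with the higher (tighter) estimate is where the attribute is localized. The paper's contribution is a latent-variable model that makes this search tractable rather than enumerating subsets by hand as above.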