🤖 AI Summary
Existing unsupervised neural grammar induction models suffer from probability distribution collapse, a phenomenon that severely limits expressivity and yields redundant, overly large grammars with poor parsing performance. Method: The paper formally characterizes distribution collapse and traces its root cause to neural parameterization; it then proposes collapse-relaxing neural parameterization (CRNP), a framework that integrates gradient-aware regularization and structural constraints to induce compact yet highly expressive hierarchical grammars. Contribution/Results: Experiments show that CRNP substantially reduces induced grammar size while matching or surpassing baseline performance on diverse parsing tasks, including constituency parsing and syntactic probing, across typologically diverse languages. These results support CRNP's effectiveness, robustness, and cross-lingual generalization.
📝 Abstract
Unsupervised neural grammar induction aims to learn interpretable hierarchical structures from language data. However, existing models face an expressiveness bottleneck, often resulting in unnecessarily large yet underperforming grammars. We identify a core issue, $\textit{probability distribution collapse}$, as the underlying cause of this limitation. We analyze when and how the collapse emerges across key components of neural parameterization and introduce a targeted solution, $\textit{collapse-relaxing neural parameterization}$, to mitigate it. Our approach substantially improves parsing performance while enabling the use of significantly more compact grammars across a wide range of languages, as demonstrated through extensive empirical analysis.
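The abstract does not spell out what probability distribution collapse looks like in practice. As a purely illustrative toy sketch (not the paper's formulation; all numbers and names here are hypothetical), one way to picture it is a softmax-parameterized rule distribution whose entropy falls toward zero, so nearly all probability mass concentrates on a single rule and the remaining rules become dead weight in the grammar:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of rule logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(p):
    # Shannon entropy in nats; near-zero entropy signals a
    # collapsed (near one-hot) distribution.
    return -sum(x * math.log(x) for x in p if x > 0)

# Hypothetical rule distributions for one nonterminal:
# large logit gaps concentrate mass on one rule ("collapse"),
# while comparable logits keep mass spread across rules.
collapsed = softmax([10.0, 0.0, 0.0, 0.0])
relaxed = softmax([1.0, 0.9, 1.1, 1.0])

print(f"collapsed entropy: {entropy(collapsed):.4f}")
print(f"relaxed entropy:   {entropy(relaxed):.4f}")
```

Under this toy picture, a collapse-relaxing parameterization would keep rule-distribution entropy from degenerating during training, so a smaller set of nonterminals and rules can stay expressively useful.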