AI Summary
Unsupervised syntactic chunking remains challenging due to the absence of explicit structural annotations.
Method: We propose a two-layer hierarchical RNN (HRNN) that models word → chunk and chunk → sentence composition, where chunks group words in a non-hierarchical manner. Training follows a two-stage paradigm: (1) pretraining with an unsupervised parser, so that chunking structure is learned without manual annotation, followed by (2) fine-tuning on downstream NLP tasks.
Contribution/Results: On CoNLL-2000, our method outperforms prior unsupervised approaches, improving phrase-level F1 by up to 6 percentage points. Fine-tuning on downstream tasks yields further gains. We also observe that the chunking structure emerges only transiently during downstream-task training, an observation that may inform future work on unsupervised syntactic structure discovery.
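To make the phrase-level F1 figure concrete, the sketch below scores a predicted chunking against a gold chunking by exact span match. It assumes unlabeled chunks encoded as per-word flags (1 means the word begins a new chunk). The function names and this encoding are illustrative assumptions, not the evaluation script used in the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) word indices, end exclusive


def boundaries_to_spans(begin_flags: List[int]) -> Set[Span]:
    """Convert per-word flags (1 = word starts a new chunk) into chunk spans.

    The first word always opens a chunk, whatever its flag says.
    """
    spans: Set[Span] = set()
    start = 0
    for i in range(1, len(begin_flags)):
        if begin_flags[i] == 1:
            spans.add((start, i))
            start = i
    if begin_flags:
        spans.add((start, len(begin_flags)))
    return spans


def phrase_f1(pred: List[int], gold: List[int]) -> float:
    """Unlabeled phrase F1: a predicted chunk counts only if its span matches exactly."""
    pred_spans = boundaries_to_spans(pred)
    gold_spans = boundaries_to_spans(gold)
    matched = len(pred_spans & gold_spans)
    if matched == 0:
        return 0.0
    precision = matched / len(pred_spans)
    recall = matched / len(gold_spans)
    return 2 * precision * recall / (precision + recall)
```

For example, with gold flags `[1, 0, 1, 0, 0, 1]` and predicted flags `[1, 0, 1, 0, 1, 1]`, two of the four predicted chunks match two of the three gold chunks, so precision is 1/2, recall is 2/3, and F1 is 4/7.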
Abstract
In Natural Language Processing (NLP), predicting linguistic structures, such as parsing and chunking, has mostly relied on manual annotations of syntactic structures. This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner. We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions. Our approach involves a two-stage training process: pretraining with an unsupervised parser and finetuning on downstream NLP tasks. Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points. Further, finetuning with downstream tasks results in an additional performance improvement. Interestingly, we observe that the emergence of the chunking structure is transient during the neural model's downstream-task training. This study contributes to the advancement of unsupervised syntactic structure discovery and opens avenues for further research in linguistic theory.
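The word-to-chunk and chunk-to-sentence composition described in the abstract can be illustrated with a toy two-layer recurrence. This is a minimal sketch under strong assumptions: chunk boundaries are given as hard 0/1 flags, the recurrent cell is an elementwise tanh with fixed weights, and the names `rnn_step` and `hrnn_encode` are hypothetical. The abstract does not specify the model's cells, boundary mechanism, or parameters, so none of this should be read as the authors' implementation.

```python
import math
from typing import List, Tuple

Vector = List[float]


def rnn_step(state: Vector, inp: Vector, w_state: float = 0.5, w_inp: float = 1.0) -> Vector:
    """One elementwise tanh recurrence step (toy stand-in for a learned RNN cell)."""
    return [math.tanh(w_state * s + w_inp * x) for s, x in zip(state, inp)]


def hrnn_encode(words: List[Vector], begins: List[int], dim: int) -> Tuple[List[Vector], Vector]:
    """Compose words into chunks (lower layer) and chunks into a sentence (upper layer).

    begins[t] == 1 means word t starts a new chunk. Returns the upper-layer state
    after each completed chunk, plus the final state as the sentence vector.
    """
    word_state = [0.0] * dim   # lower layer: runs over the words of the current chunk
    chunk_state = [0.0] * dim  # upper layer: runs over completed chunks
    chunk_states: List[Vector] = []
    for t, (x, b) in enumerate(zip(words, begins)):
        if b == 1 and t > 0:
            # The previous chunk is complete: pass its summary to the upper layer,
            # then restart the lower layer for the new chunk.
            chunk_state = rnn_step(chunk_state, word_state)
            chunk_states.append(chunk_state)
            word_state = [0.0] * dim
        word_state = rnn_step(word_state, x)
    if words:
        # Flush the last chunk at the end of the sentence.
        chunk_state = rnn_step(chunk_state, word_state)
        chunk_states.append(chunk_state)
    return chunk_states, chunk_state
```

Calling `hrnn_encode` on four 2-dimensional word vectors with flags `[1, 0, 1, 0]` produces two chunk-level states, the second of which serves as the sentence representation. In a trained model the boundary decisions would be predicted rather than supplied.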