AI Summary
To address the limited robustness of dependency parsing for free-word-order, morphologically rich, and low-resource languages, this paper proposes a word-order-agnostic contrastive self-supervised learning framework. It introduces contrastive learning (InfoNCE) to dependency parsing for the first time, removes positional encodings, employs word-order perturbation as a data augmentation strategy, and integrates morphological feature embeddings into a graph-based parser, thereby enhancing generalization to diverse word orders without requiring additional annotations. Leveraging the inherent word-order flexibility of such languages, the method is fully self-supervised. Experiments across seven free-word-order languages demonstrate average improvements of 3.03 UAS and 2.95 LAS points over state-of-the-art baselines. The approach establishes a scalable, annotation-free paradigm for low-resource dependency parsing, significantly advancing robustness and cross-lingual applicability.
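The word-order perturbation described above has to keep the dependency tree intact while shuffling the surface order, so that the original and perturbed sentences form a valid positive pair. The following is a minimal sketch of such an augmentation; the function name `permute_sentence` and the head-index convention are illustrative assumptions, not the paper's exact implementation.

```python
import random

def permute_sentence(tokens, heads, seed=0):
    """Word-order perturbation (hypothetical sketch): shuffle the tokens
    of a sentence while remapping head indices so the dependency tree is
    preserved. heads[i] is the 0-based index of token i's head, or -1
    for the root."""
    rng = random.Random(seed)
    order = list(range(len(tokens)))        # new position -> old position
    rng.shuffle(order)
    new_pos = {old: new for new, old in enumerate(order)}
    new_tokens = [tokens[old] for old in order]
    new_heads = [-1] * len(tokens)
    for old_i, h in enumerate(heads):
        new_heads[new_pos[old_i]] = -1 if h == -1 else new_pos[h]
    return new_tokens, new_heads
```

Because every head index is remapped through the same permutation, each token's governor is unchanged; only the linear order differs, which is exactly the invariance a word-order-robust parser is trained to exploit.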
Abstract
Neural dependency parsing has achieved remarkable performance for low-resource, morphologically rich languages. It is also well established that morphologically rich languages exhibit relatively free word order. This prompts a fundamental investigation: can dependency parsing performance be enhanced by making the model robust to word-order variations, exploiting the relatively free word order of morphologically rich languages? In this work, we examine the robustness of graph-based parsing architectures on 7 relatively free-word-order languages. We focus on scrutinizing essential modifications, such as data augmentation and the removal of position encoding, required to adapt these architectures accordingly. To this end, we propose a contrastive self-supervised learning method to make the model robust to word-order variations. Our proposed modifications yield a substantial average gain of 3.03/2.95 points in UAS/LAS across the 7 relatively free-word-order languages when compared to the best-performing baseline.
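The contrastive objective named in the summary is InfoNCE: the encoding of a sentence should score higher against its word-order-perturbed counterpart (the positive) than against encodings of other sentences (the negatives). A minimal dependency-free sketch of this loss for a single anchor is below; the cosine-similarity scoring and the temperature value are common choices and assumptions here, not details confirmed by the abstract.

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor vector (illustrative sketch).
    Returns -log softmax of the positive's similarity among all
    candidates, scaled by a temperature."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    # Similarity of the anchor to the positive, then to each negative.
    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp for the softmax denominator.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]
```

Minimizing this loss pulls a sentence's encoding toward its perturbed variant and pushes it away from other sentences, which is how word-order invariance is induced without any extra annotation.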