AI Summary
To address the limited robustness of dependency parsing for free-word-order, morphologically rich, and low-resource languages, this paper proposes a word-order-agnostic contrastive self-supervised learning framework. It introduces contrastive learning (InfoNCE) to dependency parsing for the first time, removes positional encodings, employs word-order perturbation as a data augmentation strategy, and integrates morphological feature embeddings into a graph-based parser, thereby enhancing generalization to diverse word orders without requiring additional annotations. Leveraging the inherent word-order flexibility of such languages, the method is fully self-supervised. Experiments across seven free-word-order languages demonstrate average improvements of 3.03 UAS and 2.95 LAS points over state-of-the-art baselines. The approach establishes a scalable, annotation-free paradigm for low-resource dependency parsing, significantly advancing robustness and cross-lingual applicability.
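The word-order perturbation described above has to keep the dependency tree intact while shuffling the surface order, so that the original and perturbed sentences form a valid positive pair. The following is a minimal sketch of such an augmentation; the function name `permute_sentence` and the head-index convention are illustrative assumptions, not the paper's exact implementation.

```python
import random

def permute_sentence(tokens, heads, seed=0):
    """Word-order perturbation (hypothetical sketch): shuffle the tokens
    of a sentence while remapping head indices so the dependency tree is
    preserved. heads[i] is the 0-based index of token i's head, or -1
    for the root."""
    rng = random.Random(seed)
    order = list(range(len(tokens)))        # new position -> old position
    rng.shuffle(order)
    new_pos = {old: new for new, old in enumerate(order)}
    new_tokens = [tokens[old] for old in order]
    new_heads = [-1] * len(tokens)
    for old_i, h in enumerate(heads):
        new_heads[new_pos[old_i]] = -1 if h == -1 else new_pos[h]
    return new_tokens, new_heads
```

Because every head index is remapped through the same permutation, each token's governor is unchanged; only the linear order differs, which is exactly the invariance a word-order-robust parser is trained to exploit.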
Abstract
Neural dependency parsing has achieved remarkable performance for low-resource, morphologically rich languages. It is also well established that morphologically rich languages exhibit relatively free word order. This prompts a fundamental investigation: can dependency parsing performance be enhanced by making the model robust to word-order variations, exploiting the relatively free word order of morphologically rich languages? In this work, we examine the robustness of graph-based parsing architectures on 7 relatively free-word-order languages. We focus on scrutinizing essential modifications, such as data augmentation and the removal of position encoding, required to adapt these architectures accordingly. To this end, we propose a contrastive self-supervised learning method to make the model robust to word-order variations. Our proposed modifications yield a substantial average gain of 3.03/2.95 points in UAS/LAS across the 7 relatively free-word-order languages when compared to the best-performing baseline.
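The contrastive objective named in the summary is InfoNCE: the encoding of a sentence should score higher against its word-order-perturbed counterpart (the positive) than against encodings of other sentences (the negatives). A minimal dependency-free sketch of this loss for a single anchor is below; the cosine-similarity scoring and the temperature value are common choices and assumptions here, not details confirmed by the abstract.

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor vector (illustrative sketch).
    Returns -log softmax of the positive's similarity among all
    candidates, scaled by a temperature."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    # Similarity of the anchor to the positive, then to each negative.
    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp for the softmax denominator.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]
```

Minimizing this loss pulls a sentence's encoding toward its perturbed variant and pushes it away from other sentences, which is how word-order invariance is induced without any extra annotation.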