An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling

📅 2024-02-21
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Integrating external knowledge into sequence labeling models often introduces data heterogeneity and model complexity, leading to high training costs and slow convergence. To address this, we propose a two-stage curriculum learning (TCL) framework tailored for heterogeneous knowledge fusion: Stage I focuses on foundational sequence modeling without external knowledge; Stage II progressively incorporates heterogeneous external knowledge via a task-adapted, multi-dimensional difficulty assessment and an adaptive sampling mechanism. Evaluated on six Chinese word segmentation (CWS) and part-of-speech (POS) tagging datasets, TCL consistently improves accuracy while accelerating training convergence and mitigating the optimization difficulties inherent in complex knowledge-augmented models. Our key contribution is the first systematic application of curriculum learning to heterogeneous knowledge-enhanced sequence labeling, enabling difficulty-aware, dynamic knowledge integration.
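The easy-to-hard schedule behind Stage II can be sketched as a pacing function over difficulty-ranked training instances. This is a hypothetical illustration, not the authors' implementation: the function name `curriculum_batches`, the linear pacing rule, and the per-epoch pool growth are all assumptions; the paper's actual adaptive sampling mechanism may differ.

```python
import random

def curriculum_batches(examples, difficulty, num_epochs, batch_size, seed=0):
    """Yield (epoch, batch) pairs, exposing progressively harder examples.

    `difficulty` maps an example to a scalar score (lower = easier).
    Linear pacing: the pool of available examples grows each epoch until
    the full training set is in play.
    """
    rng = random.Random(seed)
    ranked = sorted(examples, key=difficulty)  # easy -> hard
    for epoch in range(num_epochs):
        frac = (epoch + 1) / num_epochs
        pool = ranked[: max(batch_size, int(frac * len(ranked)))]
        rng.shuffle(pool)  # slicing copies, so `ranked` stays ordered
        for i in range(0, len(pool), batch_size):
            yield epoch, pool[i : i + batch_size]
```

Early epochs draw only from the easiest slice of the data; the final epoch samples from the entire set, matching the "gradual data introduction" idea described above.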

📝 Abstract
Sequence labeling models often benefit from incorporating external knowledge. However, this practice introduces data heterogeneity and complicates the model with additional modules, increasing the cost of training a high-performing model. To address this challenge, we propose a two-stage curriculum learning (TCL) framework specifically designed for sequence labeling tasks. The TCL framework enhances training by gradually introducing data instances from easy to hard, aiming to improve both performance and training speed. Furthermore, we explore different metrics for assessing the difficulty levels of sequence labeling tasks. Through extensive experimentation on six Chinese word segmentation (CWS) and part-of-speech (POS) tagging datasets, we demonstrate the effectiveness of our model in enhancing the performance of sequence labeling models. Additionally, our analysis indicates that TCL accelerates training and alleviates the slow-training problem associated with complex models.
Problem

Research questions and friction points this paper is trying to address.

Incorporating external knowledge causes data heterogeneity.
Complex models with extra modules increase training costs.
Slow training and convergence of knowledge-augmented sequence labeling models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage curriculum learning framework
Gradual data introduction strategy
Difficulty metrics for sequence labeling
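One plausible instantiation of difficulty metrics for sequence labeling combines surface cues such as sentence length and token rarity. The function below is a hedged sketch: `difficulty_score`, its weights, and the rarity formula are illustrative assumptions, not the metrics the paper actually proposes.

```python
import math
from collections import Counter

def difficulty_score(tokens, freq, w_len=0.5, w_rare=0.5):
    """Blend sentence length and mean token rarity into one difficulty score.

    `freq` is a Counter of token frequencies over the training corpus;
    rarer tokens contribute a larger log-rarity term. Higher score = harder.
    """
    total = sum(freq.values()) or 1
    length = len(tokens)
    rarity = sum(math.log(total / freq.get(t, 1)) for t in tokens) / max(length, 1)
    return w_len * length + w_rare * rarity
```

Scores like this can rank training instances for an easy-to-hard curriculum; the paper's multi-dimensional assessment would presumably weigh several such signals, adapted to the CWS and POS tagging tasks.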