A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion

📅 2024-02-20

🏛️ arXiv.org

📈 Citations: 5

✨ Influential: 0

🤖 AI Summary

Problem: Existing research treats entity set expansion, taxonomy expansion, and seed-guided taxonomy construction as disjoint tasks, so methods built for one task generalize poorly to the others and no unified model exists. Method: This paper proposes a unified framework for all three tasks, built on jointly learning two structured reasoning skills, “finding sibling nodes” and “finding parent nodes”. Taxonomy-guided instruction tuning, joint pre-training on both skills, and structure-aware prompting let the two skills complement and reinforce each other. Contribution/Results: The framework identifies the shared skills underlying the three tasks and outperforms task-specific baselines on all three across multiple benchmark datasets, without task-specific architectural modifications.

📝 Abstract

Entity set expansion, taxonomy expansion, and seed-guided taxonomy construction are three representative tasks that can be applied to automatically populate an existing taxonomy with emerging concepts. Previous studies view them as three separate tasks. Therefore, their proposed techniques usually work for one specific task only, lacking generalizability and a holistic perspective. In this paper, we aim at a unified solution to the three tasks. To be specific, we identify two common skills needed for entity set expansion, taxonomy expansion, and seed-guided taxonomy construction: finding "siblings" and finding "parents". We propose a taxonomy-guided instruction tuning framework to teach a large language model to generate siblings and parents for query entities, where the joint pre-training process facilitates the mutual enhancement of the two skills. Extensive experiments on multiple benchmark datasets demonstrate the efficacy of our proposed TaxoInstruct framework, which outperforms task-specific baselines across all three tasks.
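
To make the two skills concrete, here is a minimal sketch of how an existing taxonomy could be turned into "find parent" and "find siblings" instruction examples. It is an illustration under stated assumptions, not the TaxoInstruct implementation: the toy taxonomy, the prompt wording, and the names `PARENT_OF`, `siblings_of`, and `build_examples` are all hypothetical.

```python
from typing import Dict, List

# Toy taxonomy as a child -> parent map (hypothetical data, not from the paper).
PARENT_OF: Dict[str, str] = {
    "machine learning": "computer science",
    "databases": "computer science",
    "deep learning": "machine learning",
    "reinforcement learning": "machine learning",
}


def siblings_of(entity: str) -> List[str]:
    """Return the other children of the entity's parent, sorted by name."""
    parent = PARENT_OF.get(entity)
    if parent is None:
        return []  # root or unknown entity: no siblings recorded
    return sorted(
        child for child, p in PARENT_OF.items() if p == parent and child != entity
    )


def build_examples() -> List[Dict[str, str]]:
    """Emit one 'parent' example and, when possible, one 'siblings' example per entity.

    The prompt templates are assumptions made for illustration only.
    """
    examples: List[Dict[str, str]] = []
    for entity, parent in PARENT_OF.items():
        examples.append({
            "skill": "parent",
            "prompt": f"Find the parent of the entity: {entity}",
            "target": parent,
        })
        sibs = siblings_of(entity)
        if sibs:
            examples.append({
                "skill": "siblings",
                "prompt": f"Find siblings of the entity: {entity}",
                "target": ", ".join(sibs),
            })
    return examples


if __name__ == "__main__":
    for example in build_examples():
        print(example["skill"], "|", example["prompt"], "->", example["target"])
```

Mixing both example types in one training set is one simple way to approximate the joint training of the two skills that the abstract describes; how the paper actually formats and samples its instructions is not specified here.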

Problem

Research questions and friction points this paper is trying to address.

Unified solution for entity and taxonomy expansion tasks

Teaching a model to find siblings and parents for query entities

Improving performance across multiple taxonomy-related benchmarks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified taxonomy-guided instruction tuning framework

Teaches a large language model to generate siblings and parents for query entities

Joint pre-training lets the two skills reinforce each other

🔎 Similar Papers

CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts