Rank, Chunk and Expand: Lineage-Oriented Reasoning for Taxonomy Expansion

📅 2025-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address three core challenges in dynamic taxonomy expansion (high candidate noise, context-length limitations, and weak generalization), this paper introduces LORex (Lineage-Oriented Reasoning for Taxonomy Expansion), a framework that combines discriminative ranking with generative hierarchical reasoning. Methodologically, it integrates BERT-based hierarchical ranking, hierarchy-aware prompting, dynamic chunking, and an iterative refinement reasoning pipeline to achieve efficient context utilization and structure-preserving expansion. Evaluated on four benchmark datasets against twelve baselines, it improves accuracy by 12% and Wu & Palmer similarity by 5% over state-of-the-art methods. The framework is plug-and-play and targets incremental taxonomy construction in large, dynamically evolving domains.

📝 Abstract

Taxonomies are hierarchical knowledge graphs crucial for recommendation systems and web applications. As data grows, expanding taxonomies is essential, but existing methods face key challenges: (1) discriminative models struggle with representation limits and generalization, while (2) generative methods either process all candidates at once, introducing noise and exceeding context limits, or discard relevant entities by selecting noisy candidates. We propose LORex ($\textbf{L}$ineage-$\textbf{O}$riented $\textbf{Re}$asoning for Taxonomy E$\textbf{x}$pansion), a plug-and-play framework that combines discriminative ranking and generative reasoning for efficient taxonomy expansion. Unlike prior methods, LORex ranks and chunks candidate terms into batches, filtering noise and iteratively refining selections by reasoning over the candidates' hierarchy to ensure contextual efficiency. Extensive experiments across four benchmarks and twelve baselines show that LORex improves accuracy by 12% and Wu & Palmer similarity by 5% over state-of-the-art methods.
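For readers unfamiliar with the Wu & Palmer metric mentioned in the abstract, the sketch below computes it on a toy taxonomy using the standard definition, 2 · depth(LCA) / (depth(a) + depth(b)). The taxonomy `TOY_PARENT`, the helper names, and the convention that the root has depth 1 are assumptions made for illustration; the paper's exact evaluation setup is not described here.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: each node maps to its parent (root maps to None).
TOY_PARENT: Dict[str, Optional[str]] = {
    "science": None,
    "biology": "science",
    "physics": "science",
    "genetics": "biology",
}


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the lineage of `node`, from the node itself up to the root."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def wu_palmer(a: str, b: str, parent: Dict[str, Optional[str]]) -> float:
    """Wu & Palmer similarity, assuming the root has depth 1."""
    path_a = path_to_root(a, parent)
    path_b = path_to_root(b, parent)
    ancestors_a = set(path_a)
    # The first node on b's lineage that also lies on a's lineage is the
    # lowest common ancestor (LCA) in a tree.
    lca = next(n for n in path_b if n in ancestors_a)
    depth_lca = len(path_to_root(lca, parent))
    return 2 * depth_lca / (len(path_a) + len(path_b))
```

Under these assumptions, a predicted parent that is close to the true parent in the hierarchy scores near 1, while one that shares only the root scores low, which is why the metric complements exact-match accuracy.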

Problem

Research questions and friction points this paper is trying to address.

Expanding taxonomies efficiently despite the representation and generalization limits of discriminative models

Reducing noise and context overload in generative taxonomy expansion methods

Improving accuracy and hierarchy reasoning in taxonomy expansion frameworks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines discriminative ranking and generative reasoning

Ranks and chunks candidate terms into batches

Iteratively refines selections by hierarchy reasoning
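The three ideas above can be sketched as a small rank, chunk, and reason loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `token_overlap` is a hypothetical stand-in for the learned discriminative ranker, `reason` is a hypothetical callable standing in for the generative (LLM) step, and the stopping rule (accept the first batch in which the reasoner finds a parent) is an assumption about how iterative refinement might proceed.

```python
from typing import Callable, List, Optional, Sequence

Scorer = Callable[[str, str], float]
Reasoner = Callable[[str, List[str]], Optional[str]]


def token_overlap(query: str, candidate: str) -> float:
    """Toy relevance score (Jaccard overlap of tokens); a stand-in for a trained ranker."""
    q, c = set(query.lower().split()), set(candidate.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def rank_candidates(query: str, candidates: Sequence[str], score: Scorer) -> List[str]:
    """Order candidate parents from most to least relevant to the query term."""
    return sorted(candidates, key=lambda cand: score(query, cand), reverse=True)


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split a ranked list into consecutive batches of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def expand(query: str, candidates: Sequence[str], score: Scorer,
           reason: Reasoner, chunk_size: int = 3) -> Optional[str]:
    """Show the reasoner one small batch at a time, best-ranked batches first.

    Assumption: the reasoner returns a chosen parent from the batch, or None
    when no candidate in the batch fits, in which case the next batch is tried.
    """
    ranked = rank_candidates(query, candidates, score)
    for batch in chunk(ranked, chunk_size):
        choice = reason(query, batch)
        if choice is not None:
            return choice
    return None
```

Keeping each batch small is what bounds the context passed to the generative step, and ranking first means noisy candidates are only reached if the higher-ranked batches are rejected.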

🔎 Similar Papers

A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion