🤖 AI Summary
To address core challenges in dynamic taxonomy expansion (noisy candidates, context-length limits, and weak generalization), this paper introduces LORex (Lineage-Oriented Reasoning for Taxonomy Expansion), a lineage-guided framework that unifies discriminative ranking with generative hierarchical reasoning. Methodologically, it combines BERT-based hierarchical ranking, hierarchy-aware prompting, dynamic chunking, and an iterative refinement pipeline to use context efficiently and preserve taxonomy structure during expansion. Evaluated on four benchmark datasets against twelve baselines, it improves accuracy by 12% and Wu & Palmer similarity by 5% over state-of-the-art methods. The framework is plug-and-play and offers a scalable, robust, structure-aware approach to incremental taxonomy construction in large, dynamically evolving domains.
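For readers unfamiliar with the metric, Wu-Palmer similarity between two taxonomy nodes is commonly defined as 2·depth(LCA) / (depth(a) + depth(b)), where LCA is their lowest common ancestor. The following is a minimal Python sketch under stated assumptions: the root is counted at depth 1, and the `parent` map and the toy taxonomy are hypothetical illustrations, not data or code from the paper.

```python
from typing import Dict, List, Optional


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain [node, its parent, ..., root]."""
    path: List[str] = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = parent[cur]
    return path


def wu_palmer(a: str, b: str, parent: Dict[str, Optional[str]]) -> float:
    """Wu-Palmer similarity, assuming the root has depth 1."""
    path_a = path_to_root(a, parent)
    path_b = path_to_root(b, parent)
    ancestors_a = set(path_a)
    # Walking up from b, the first node that also lies on a's path is the LCA.
    lca = next(n for n in path_b if n in ancestors_a)
    depth_lca = len(path_to_root(lca, parent))
    # A node's depth equals the length of its path to the root.
    return 2 * depth_lca / (len(path_a) + len(path_b))


# Hypothetical toy taxonomy: child -> parent (the root maps to None).
toy_parent: Dict[str, Optional[str]] = {
    "science": None,
    "biology": "science",
    "physics": "science",
    "genetics": "biology",
    "ecology": "biology",
}

sibling_score = wu_palmer("genetics", "ecology", toy_parent)  # LCA: biology
distant_score = wu_palmer("genetics", "physics", toy_parent)  # LCA: science
```

In taxonomy-expansion evaluation, this score is typically computed between the predicted parent and the gold parent, so a near miss (a sibling of the true parent) scores higher than a distant node.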
📝 Abstract
Taxonomies are hierarchical knowledge graphs crucial for recommendation systems and web applications. As data grows, expanding taxonomies is essential, but existing methods face key challenges: (1) discriminative models struggle with representation limits and generalization, while (2) generative methods either process all candidates at once, introducing noise and exceeding context limits, or discard relevant entities by selecting noisy candidates. We propose LORex ($\textbf{L}$ineage-$\textbf{O}$riented $\textbf{Re}$asoning for Taxonomy E$\textbf{x}$pansion), a plug-and-play framework that combines discriminative ranking and generative reasoning for efficient taxonomy expansion. Unlike prior methods, LORex ranks candidate terms and chunks them into batches, filtering noise and iteratively refining selections by reasoning over the candidates' hierarchy to use context efficiently. Extensive experiments across four benchmarks and twelve baselines show that LORex improves accuracy by 12% and Wu & Palmer similarity by 5% over state-of-the-art methods.
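The rank, chunk, and refine loop described above can be sketched roughly as follows. This is an illustrative Python sketch under loud assumptions, not the authors' implementation: `score` stands in for the discriminative ranker, `pick` stands in for the generative reasoning step (in practice an LLM prompted with each candidate's lineage), and the tournament-style refinement, `chunk_size`, and `top_k` are hypothetical choices made only to show how batching keeps each reasoning call within a small context.

```python
from typing import Callable, List


def chunk(items: List[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def expand(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float],    # hypothetical ranker stand-in
    pick: Callable[[str, List[str]], str],  # hypothetical reasoner stand-in
    chunk_size: int = 2,
    top_k: int = 4,
) -> str:
    """Return one candidate parent for `query` via rank -> chunk -> refine."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if chunk_size < 2:
        raise ValueError("chunk_size must be >= 2 so each round shrinks the pool")
    # 1. Rank all candidates and keep the top-k, filtering out noisy ones.
    pool = sorted(candidates, key=lambda c: score(query, c), reverse=True)[:top_k]
    # 2. Refine iteratively: each batch is small enough for one reasoning
    #    call, and only the batch winners advance to the next round.
    while len(pool) > 1:
        pool = [pick(query, batch) for batch in chunk(pool, chunk_size)]
    return pool[0]


def word_overlap(query: str, candidate: str) -> float:
    """Toy scorer: number of words shared by query and candidate."""
    return float(len(set(query.lower().split()) & set(candidate.lower().split())))


def stub_pick(query: str, batch: List[str]) -> str:
    """Toy reasoner: reuses the scorer; a real system would call an LLM here."""
    return max(batch, key=lambda c: word_overlap(query, c))


candidates = [
    "neural network",
    "network protocol",
    "deep learning",
    "organic chemistry",
    "machine learning",
]
chosen = expand("deep neural network", candidates, word_overlap, stub_pick)
```

With these toy stand-ins, `chosen` is the candidate sharing the most words with the query. Swapping in a learned ranker and an LLM-backed `pick` would leave the control flow unchanged, which is one way to read the plug-and-play claim.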