🤖 AI Summary
Problem: Constructing fine-grained taxonomies in food science is labor-intensive and error-prone due to domain complexity and sparse annotation. Method: The authors propose an iterative, LLM-based approach to automatic taxonomy generation and completion with two initialization modes: progressive expansion from a seed taxonomy, or bottom-up hierarchical construction from a set of known concepts. The approach uses a taxonomy-aware iterative prompting strategy that combines chain-of-thought reasoning, self-consistency verification, and explicit hierarchical constraints to jointly infer "concept → parent → path." Results: Experiments on five real-world food taxonomies with Llama-3 are promising, but they show that correctly placing inner nodes remains difficult. The study examines the feasibility of LLMs for fine-grained, domain-specific taxonomy generation, with a view to automated knowledge graph construction in specialized domains.
📝 Abstract
We investigate the utility of Large Language Models for automated taxonomy generation and completion specifically applied to taxonomies from the food technology industry. We explore the extent to which taxonomies can be completed from a seed taxonomy or generated without a seed from a set of known concepts, in an iterative fashion using recent prompting techniques. Experiments on five taxonomies using an open-source LLM (Llama-3), while promising, point to the difficulty of correctly placing inner nodes.
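The iterative completion from a seed taxonomy described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `complete_taxonomy`, `path_to_root`, and `toy_choose_parent` are hypothetical, and the LLM call is replaced by a plain function that proposes a parent for a concept. A concept whose proposed parent is not yet in the taxonomy is deferred to a later round.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A taxonomy is stored as a child -> parent map; the root has parent None.
Taxonomy = Dict[str, Optional[str]]


def path_to_root(taxonomy: Taxonomy, concept: str) -> List[str]:
    """Return the path from the root down to `concept`."""
    path = [concept]
    while taxonomy[concept] is not None:
        concept = taxonomy[concept]
        path.append(concept)
    return list(reversed(path))


def complete_taxonomy(
    seed: Taxonomy,
    concepts: List[str],
    choose_parent: Callable[[str, Taxonomy], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[Taxonomy, List[str]]:
    """Iteratively attach concepts to the taxonomy.

    `choose_parent` stands in for an LLM prompt (an assumption of this
    sketch). A proposal is accepted only if the parent already exists.
    """
    taxonomy = dict(seed)
    pending = [c for c in concepts if c not in taxonomy]
    for _ in range(max_rounds):
        if not pending:
            break
        still_pending = []
        for concept in pending:
            parent = choose_parent(concept, taxonomy)
            if parent in taxonomy:
                taxonomy[concept] = parent
            else:
                still_pending.append(concept)  # retry in a later round
        if len(still_pending) == len(pending):
            break  # no progress in this round, so stop early
        pending = still_pending
    return taxonomy, pending


def toy_choose_parent(concept: str, taxonomy: Taxonomy) -> Optional[str]:
    """Hypothetical stand-in for the LLM: a fixed lookup table."""
    hints = {"cheddar": "cheese", "cheese": "dairy", "apple": "fruit"}
    return hints.get(concept)


seed = {"food": None, "dairy": "food", "fruit": "food"}
completed, unplaced = complete_taxonomy(
    seed, ["cheddar", "cheese", "apple", "tofu"], toy_choose_parent
)
```

In this toy run, "cheddar" can only be placed in the second round, after the inner node "cheese" has been added. The abstract reports that placing such inner nodes correctly is the difficult part.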