Building a Custom Taxonomy of AI Skills and Tasks from the Ground Up with Job Postings

📅 2026-05-20

🤖 AI Summary

This work asks how to build a comprehensive, well-structured taxonomy of AI skills and tasks from large volumes of job-posting data. The authors propose TaxonomyBuilder, a framework that combines systematic data filtering, clustering, and LLM-enhanced hierarchical label generation to derive domain-specific taxonomies from filtered data subsets. Their experiments indicate that taxonomies built from filtered inputs provide better domain-specific coverage than those produced by giving unfiltered inputs to clustering and LLM-enhanced hierarchical labeling tools. The study's central point is that, for data-driven taxonomy construction, less data can provide more clarity.
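To make the three stages named in the summary concrete (filter, cluster, label), here is a minimal sketch in Python. It is not the authors' TaxonomyBuilder implementation: the keyword list, the greedy Jaccard clustering, the similarity threshold, and the frequency-based labeler (a stand-in for the LLM labeling step) are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import re
from collections import Counter

# Hypothetical domain keywords; the paper's actual filtering criteria are not given here.
AI_KEYWORDS = {"machine", "learning", "neural", "nlp", "llm"}
STOPWORDS = {"a", "and", "for", "in", "of", "the", "to"}


def tokenize(text):
    """Lowercase a posting and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def filter_postings(postings, keywords=AI_KEYWORDS):
    """Stage 1 (assumed): keep postings that mention at least one keyword."""
    return [p for p in postings if tokenize(p) & keywords]


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(postings, threshold=0.2):
    """Stage 2 (assumed): greedy single-pass clustering against each cluster's first item."""
    clusters = []
    for posting in postings:
        tokens = tokenize(posting)
        for members in clusters:
            if jaccard(tokens, tokenize(members[0])) >= threshold:
                members.append(posting)
                break
        else:
            clusters.append([posting])
    return clusters


def label(members):
    """Stage 3 stand-in for an LLM labeler: most frequent token, ties broken alphabetically."""
    counts = Counter(t for p in members for t in tokenize(p) - STOPWORDS)
    if not counts:
        return "unlabeled"
    return min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]


def build_taxonomy(postings):
    """Run filter -> cluster -> label and return a list of (label, members) pairs."""
    return [(label(members), members) for members in cluster(filter_postings(postings))]


if __name__ == "__main__":
    sample = [
        "Train machine learning models in Python",
        "Deploy machine learning models to production",
        "Forklift operator for warehouse shifts",
        "Fine-tune LLM prompts for NLP chatbots",
    ]
    for name, members in build_taxonomy(sample):
        print(name, members)
```

In this toy run the warehouse posting is dropped before clustering, so it cannot form a cluster of its own or dilute the labels of the AI-related ones; that is the intuition behind filtering inputs first. A real pipeline would replace the keyword filter, the clustering step, and the labeler with the components the paper evaluates.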

📝 Abstract

Utilizing LLMs for automated taxonomy construction presents a clear opportunity for the comprehensive, yet efficient mapping of potentially complex domains. When contending with high volumes of rapidly growing corpora, however, it becomes unclear how to best leverage such data for optimal taxonomy construction. Taking the case of systematizing AI skills in the workplace, we use two large-scale job postings corpora to investigate key design decisions for the inclusion (or exclusion) of data points for taxonomy construction. We propose TaxonomyBuilder as a blueprint for our systematic study, with which we evaluate various configurations of custom, data-informed, and hierarchical taxonomies. We demonstrate that less data can provide more clarity: filtering inputs to TaxonomyBuilder provides better domain-specific coverage than offering unfiltered inputs to clustering and LLM-enhanced hierarchical taxonomy labeling tools.

Problem

Research questions and friction points this paper is trying to address.

AI skills

taxonomy construction

job postings

data filtering

hierarchical taxonomy

Innovation

Methods, ideas, or system contributions that make the work stand out.

TaxonomyBuilder

AI skills taxonomy

job postings