Hierarchical Job Classification with Similarity Graph Integration

📅 2025-07-14
🤖 AI Summary
To address low classification accuracy, difficulty in modeling hierarchical industry-category relationships, and cold-start challenges in online job recruitment, this paper proposes a semantic-enhanced representation learning framework that jointly leverages hierarchical taxonomies (the Standard Occupational Classification (SOC) system and Carotene, an in-house taxonomy) and a job similarity graph. The framework is described as the first to simultaneously encode the hierarchical constraints of occupational classification systems and graph-structured relational semantics, embedding both jobs and categories into a shared latent space. It introduces a hierarchical classification loss and integrates graph neural networks for end-to-end optimization. Evaluated on a large-scale real-world dataset of job postings, it outperforms existing baselines in classification accuracy, and the results indicate that it captures both hierarchical semantics and structural dependencies. The authors position the framework as a scalable basis for dynamic job matching, personalized job recommendation, and labor market analytics.
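
The summary mentions a hierarchical classification loss but does not give its form. One common construction, shown here as a minimal sketch and not as the paper's actual loss, sums a softmax cross-entropy term per taxonomy level, scoring each category by its dot product with the job embedding. The function names, the two-level toy taxonomy, and the level weights are all illustrative assumptions.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax_cross_entropy(scores, target_index):
    """Cross-entropy of softmax(scores) against a single true index."""
    m = max(scores)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_norm - scores[target_index]


def hierarchical_loss(job_vec, level_embeddings, target_path, level_weights):
    """Weighted sum of per-level cross-entropy losses (assumed form).

    level_embeddings: one list of category vectors per taxonomy level.
    target_path: index of the true category at each level, root to leaf.
    level_weights: hypothetical per-level weights.
    """
    total = 0.0
    for cats, target, weight in zip(level_embeddings, target_path, level_weights):
        scores = [dot(job_vec, c) for c in cats]
        total += weight * softmax_cross_entropy(scores, target)
    return total


# Toy data: a 2-d job embedding, 2 top-level and 3 leaf categories.
job = [1.0, 0.0]
levels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]],
]
loss_good = hierarchical_loss(job, levels, [0, 0], [1.0, 1.0])
loss_bad = hierarchical_loss(job, levels, [1, 2], [1.0, 1.0])
print(loss_good < loss_bad)
```

A label path whose category vectors align with the job embedding yields a lower loss than a path that disagrees at both levels, which is the behavior a level-wise loss is meant to encourage.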

📝 Abstract
In the dynamic realm of online recruitment, accurate job classification is paramount for optimizing job recommendation systems, search rankings, and labor market analyses. As job markets evolve, the increasing complexity of job titles and descriptions necessitates sophisticated models that can effectively leverage intricate relationships within job data. Traditional text classification methods often fall short, particularly due to their inability to fully utilize the hierarchical nature of industry categories. To address these limitations, we propose a novel representation learning and classification model that embeds jobs and hierarchical industry categories into a latent embedding space. Our model integrates the Standard Occupational Classification (SOC) system and an in-house hierarchical taxonomy, Carotene, to capture both graph and hierarchical relationships, thereby improving classification accuracy. By embedding hierarchical industry categories into a shared latent space, we tackle cold start issues and enhance the dynamic matching of candidates to job opportunities. Extensive experimentation on a large-scale dataset of job postings demonstrates the model's superior ability to leverage hierarchical structures and rich semantic features, significantly outperforming existing methods. This research provides a robust framework for improving job classification accuracy, supporting more informed decision-making in the recruitment industry.
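
To illustrate why a shared latent space helps with cold start, the sketch below classifies a previously unseen job by comparing its embedding with category embeddings, descending the hierarchy one level at a time. This is an assumed inference scheme, not one the abstract specifies; the category names, vectors, and the `classify_top_down` helper are hypothetical, and cosine similarity is an illustrative choice of score.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


# Hypothetical two-level taxonomy: parent category -> child categories.
CHILDREN = {
    "healthcare": ["nurse", "pharmacist"],
    "technology": ["software_engineer", "data_analyst"],
}

# Hypothetical embeddings; jobs and categories live in the same 3-d space.
EMBEDDINGS = {
    "healthcare": [1.0, 0.0, 0.0],
    "technology": [0.0, 1.0, 0.0],
    "nurse": [0.9, 0.0, 0.4],
    "pharmacist": [0.9, 0.0, -0.4],
    "software_engineer": [0.0, 0.9, 0.4],
    "data_analyst": [0.0, 0.9, -0.4],
}


def classify_top_down(job_vec, children, emb):
    """Pick the closest parent, then the closest child of that parent."""
    parent = max(children, key=lambda c: cosine(job_vec, emb[c]))
    leaf = max(children[parent], key=lambda c: cosine(job_vec, emb[c]))
    return parent, leaf


# A new posting with no interaction history, only an embedding.
new_job = [0.1, 0.8, 0.5]
prediction = classify_top_down(new_job, CHILDREN, EMBEDDINGS)
print(prediction)  # ('technology', 'software_engineer')
```

Because the decision uses only distances in the shared space, a job or category added after training can be scored as soon as it has an embedding, without requiring historical labels for it.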

Problem

Research questions and friction points this paper is trying to address.

Improving job classification accuracy in dynamic online recruitment
Addressing limitations of traditional text classification methods
Enhancing candidate-job matching with hierarchical industry categories

Innovation

Methods, ideas, or system contributions that make the work stand out.

Embedding jobs and hierarchical industry categories into a shared latent space
Integrating SOC and Carotene taxonomies
Leveraging hierarchical semantic features