LABOR-LLM: Language-Based Occupational Representations with Large Language Models

📅 2024-06-25
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
This work addresses the limited representational capacity of conventional resume encoders in occupational transition prediction, proposing a language-based paradigm for occupational representation grounded in large language models (LLMs). Methodologically, structured career history data is converted into resume-like textual sequences, and small-to-medium-sized LLMs are fine-tuned on these sequences with a next-token prediction objective; the fine-tuned LLM is then used as an input to a downstream transition prediction model. The contributions are twofold: first, the traditional transformer-based resume encoder is replaced by an LLM as the foundational representation mechanism; second, the paper empirically demonstrates that compact LLMs, when fine-tuned with additional career data from a different population, outperform fine-tuned larger models. Experiments on predicting workers' next occupations show significant improvements over state-of-the-art baselines, including CAREER, validating the effectiveness of language-based occupational representation.

📝 Abstract
Vafa et al. (2024) introduced a transformer-based econometric model, CAREER, that predicts a worker's next job as a function of career history (an "occupation model"). CAREER was initially estimated ("pre-trained") using a large, unrepresentative resume dataset, which served as a "foundation model," and parameter estimation was continued ("fine-tuned") using data from a representative survey. CAREER had better predictive performance than benchmarks. This paper considers an alternative where the resume-based foundation model is replaced by a large language model (LLM). We convert tabular data from the survey into text files that resemble resumes and fine-tune the LLMs using these text files with the objective to predict the next token (word). The resulting fine-tuned LLM is used as an input to an occupation model. Its predictive performance surpasses all prior models. We demonstrate the value of fine-tuning and further show that by adding more career data from a different population, fine-tuning smaller LLMs surpasses the performance of fine-tuning larger models.
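The tabular-to-text conversion step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual templating code; the field names (`education`, `jobs`, `year`, `occupation`) and the rendered format are assumptions.

```python
# Hypothetical sketch of converting one tabular survey record into a
# resume-like text sequence suitable for LLM fine-tuning.
# The schema below is illustrative, not the paper's actual survey schema.

def career_to_resume_text(record):
    """Render a structured career history as plain resume-like text."""
    lines = [f"Education: {record['education']}"]
    for job in record["jobs"]:
        lines.append(f"{job['year']}: worked as {job['occupation']}")
    return "\n".join(lines)

example = {
    "education": "high school",
    "jobs": [
        {"year": 1990, "occupation": "cashier"},
        {"year": 1993, "occupation": "retail supervisor"},
    ],
}

print(career_to_resume_text(example))
```

Text in this form can then be fed to a causal LLM trainer, whose next-token objective naturally covers predicting the occupation that follows a given history.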
Problem

Research questions and friction points this paper is trying to address.

Predicts worker's next job using career history.
Replaces resume-based model with fine-tuned LLM.
Enhances predictive performance with additional career data.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Replaces resume-based model with LLM
Converts survey data to resume-like text
Fine-tunes LLM for occupation prediction
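The next-token objective behind the fine-tuning step can be illustrated, in a deliberately simplified form, with a bigram transition model over occupation sequences: estimate which occupation most often follows the current one. This toy stand-in (invented function names, toy data) only conveys the idea of conditioning the next occupation on history; the paper's models condition on the full textual career history via an LLM.

```python
from collections import Counter, defaultdict

# Toy stand-in for the next-token prediction idea: estimate
# P(next occupation | current occupation) from observed transitions.
# Data and function names are illustrative, not from the paper.

def fit_transitions(careers):
    """Count occupation-to-occupation transitions across career sequences."""
    counts = defaultdict(Counter)
    for seq in careers:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, occupation):
    """Return the most frequently observed successor of `occupation`."""
    return counts[occupation].most_common(1)[0][0]

careers = [
    ["cashier", "retail supervisor", "store manager"],
    ["cashier", "retail supervisor", "retail supervisor"],
    ["waiter", "cashier", "retail supervisor"],
]

model = fit_transitions(careers)
print(predict_next(model, "cashier"))  # → retail supervisor
```

An LLM fine-tuned on resume-like text generalizes this idea: instead of a count table keyed by the previous job alone, it conditions on the entire textual history, which is what lets it outperform encoder-based baselines like CAREER.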