An Efficient Long-Context Ranking Architecture With Calibrated LLM Distillation: Application to Person-Job Fit

📅 2026-01-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a re-ranking model based on a late cross-attention architecture to address inefficiency and historical bias when matching long, multilingual, structured resumes against job descriptions. The approach decomposes both resumes and project briefs to model long-range contextual dependencies efficiently, and uses a generative large language model to produce fine-grained semantic supervision signals for knowledge distillation. An enriched distillation loss function improves matching consistency and interpretability. Experiments show that the proposed model outperforms state-of-the-art methods in relevance, ranking quality, and calibration, improving the accuracy and reliability of person-job matching.

📝 Abstract
Finding the most relevant person for a job proposal in real time is challenging, especially when resumes are long, structured, and multilingual. In this paper, we propose a re-ranking model based on a new late cross-attention architecture that decomposes both resumes and project briefs to efficiently handle long-context inputs with minimal computational overhead. To mitigate historical data biases, we use a generative large language model (LLM) as a teacher to generate fine-grained, semantically grounded supervision. This signal is distilled into our student model via an enriched distillation loss function. The resulting model produces skill-fit scores that enable consistent and interpretable person-job matching. Experiments on relevance, ranking, and calibration metrics demonstrate that our approach outperforms state-of-the-art baselines.
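As a rough illustration of the ideas in the abstract (not the authors' implementation), the sketch below pairs a late-interaction scorer over pre-encoded resume and brief segments with a distillation loss that blends an LLM teacher's soft skill-fit scores with historical labels. All function names, tensor shapes, and the `alpha` mixing weight are assumptions made for illustration.

```python
# Hedged sketch, not the paper's code: late interaction over pre-encoded
# segments, plus a distillation loss mixing LLM teacher scores with
# (possibly biased) historical labels.
import numpy as np

def late_interaction_score(resume_segs: np.ndarray, brief_segs: np.ndarray) -> float:
    """MaxSim-style late interaction: each brief segment is matched to its
    best-scoring resume segment, and the per-segment maxima are averaged.
    Inputs are (n_segments, dim) arrays of unit-normalized embeddings,
    produced by encoding each document segment independently (the step
    that keeps long-context inputs cheap)."""
    sims = brief_segs @ resume_segs.T          # (m, n) cosine similarities
    return float(sims.max(axis=1).mean())      # best resume match per brief segment

def distillation_loss(student_logits, teacher_probs, labels, alpha=0.5):
    """Blend of (a) MSE toward the LLM teacher's soft skill-fit scores and
    (b) binary cross-entropy toward historical labels. `alpha` is an
    assumed mixing weight, not a value from the paper."""
    p = 1.0 / (1.0 + np.exp(-student_logits))  # student skill-fit probabilities
    eps = 1e-12                                 # numerical safety for log
    bce = -(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps)).mean()
    mse = ((p - teacher_probs) ** 2).mean()
    return alpha * mse + (1 - alpha) * bce
```

Training the student toward the teacher's soft scores, rather than hard historical labels alone, is one plausible way such a model can yield outputs that behave like calibrated skill-fit probabilities, which is consistent with the paper's emphasis on calibration metrics.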
Problem

Research questions and friction points this paper is trying to address.

person-job fit
long-context ranking
resume matching
multilingual resumes
real-time relevance
Innovation

Methods, ideas, or system contributions that make the work stand out.

late cross-attention
long-context ranking
LLM distillation
person-job fit
calibrated ranking
Warren Jouanneau
Malt, 33000 Bordeaux, France
Emma Jouffroy
Malt, 33000 Bordeaux, France
Marc Palyart
Director of Machine Learning, Malt, France
Machine Learning · Software Engineering · High-Performance Computing