Semantic Search At LinkedIn

📅 2026-02-07
🤖 AI Summary
This work proposes a semantic retrieval framework for LinkedIn’s job and talent search, designed to replace traditional keyword matching with large language model (LLM)-driven semantic search under strict latency constraints. The framework integrates LLM-based relevance scoring, embedding-based retrieval, and a compact student model trained via multi-teacher distillation. It further introduces a prefill-guided inference architecture augmented with model pruning, context compression, and a hybrid text-embedding interaction mechanism. The resulting system achieves over 75× throughput improvement while preserving near-teacher-model NDCG performance, establishing one of the first industry-scale LLM-powered semantic ranking systems that simultaneously delivers high efficiency, scalability, and significantly enhanced search quality and user engagement.

πŸ“ Abstract
Semantic search with large language models (LLMs) enables retrieval by meaning rather than keyword overlap, but scaling it requires major inference efficiency advances. We present LinkedIn's LLM-based semantic search framework for AI Job Search and AI People Search, combining an LLM relevance judge, embedding-based retrieval, and a compact Small Language Model trained via multi-teacher distillation to jointly optimize relevance and engagement. A prefill-oriented inference architecture co-designed with model pruning, context compression, and text-embedding hybrid interactions boosts ranking throughput by over 75x under a fixed latency constraint while preserving near-teacher-level NDCG, enabling one of the first production LLM-based ranking systems with efficiency comparable to traditional approaches and delivering significant gains in quality and user engagement.
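
The abstract reports quality as NDCG relative to the teacher model. As a minimal sketch of that metric, assuming the common exponential-gain form with a log2 rank discount (the listing does not state which NDCG variant or cutoff the paper uses, and the function names and relevance grades below are hypothetical):

```python
import math


def dcg(relevances):
    """Discounted cumulative gain for relevance grades given in ranked order."""
    # Assumed form: gain 2^rel - 1, discount log2(rank + 1) with 1-based ranks.
    return sum((2 ** rel - 1) / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    all_rels = list(relevances)
    ranked = all_rels[:k] if k is not None else all_rels
    ideal = sorted(all_rels, reverse=True)[: len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical relevance grades of the documents, in the order each model ranked them.
teacher_ranking = [3, 2, 2, 1, 0]
student_ranking = [3, 2, 1, 2, 0]
print(ndcg(student_ranking, k=5) / ndcg(teacher_ranking, k=5))
```

A ratio close to 1 is what "near-teacher-level NDCG" would mean under these assumptions.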
Problem

Research questions and friction points this paper is trying to address.

semantic search
large language models
inference efficiency
relevance ranking
scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic search
large language models
model distillation
inference optimization
embedding retrieval
Authors
Fedor Borisyuk
LinkedIn
Machine Learning
Sriram Vasudevan
LinkedIn, Mountain View, CA, USA
Muchen Wu
LinkedIn, Mountain View, CA, USA
Guoyao Li
LinkedIn, Mountain View, CA, USA
Benjamin Le
LinkedIn, Mountain View, CA, USA
Shaobo Zhang
LinkedIn, Mountain View, CA, USA
Qianqi Kay Shen
LinkedIn, Mountain View, CA, USA
Yuchin Juan
LinkedIn
Machine Learning
Kayhan Behdin
LinkedIn
Operations Research, Optimization, Applied Statistics
Liming Dong
CSIRO Data61
Software Engineering, Software Traceability, Data Quality, DevOps, AgentOps
Kaixu Yang
LinkedIn, Mountain View, CA, USA
Shusen Jing
LinkedIn, Mountain View, CA, USA
Ravi Pothamsetty
LinkedIn, Mountain View, CA, USA
Rajat Arora
LinkedIn
Recommendation Systems, Generative AI, LLMs, NLP, Computer Vision
Sophie Yanying Sheng
LinkedIn, Mountain View, CA, USA
Vitaly Abdrashitov
LinkedIn, Mountain View, CA, USA
Yang Zhao
LinkedIn, Mountain View, CA, USA
Lin Su
LinkedIn, Mountain View, CA, USA
Xiaoqing Wang
LinkedIn, Mountain View, CA, USA
Chujie Zheng
Qwen Team, Alibaba Group
Artificial Intelligence, Large Language Models
Sarang Metkar
LinkedIn, Mountain View, CA, USA
Rupesh Gupta
LinkedIn, Mountain View, CA, USA
Igor Lapchuk
LinkedIn, Mountain View, CA, USA
David N. Racca
LinkedIn, Mountain View, CA, USA
Madhumitha Mohan
LinkedIn, Mountain View, CA, USA