Modernizing Facebook Scoped Search: Keyword and Embedding Hybrid Retrieval with LLM Evaluation

📅 2025-09-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address low relevance, insufficient result diversity, and high manual evaluation costs in social network group search, this paper proposes a hybrid search framework that integrates keyword retrieval with embedding-based retrieval (EBR), combining inverted-index and vector search. It also introduces an automated, offline evaluation system powered by large language models (LLMs) that quantifies both the relevance and the diversity of search results efficiently and consistently. Deployed in Facebook's production group search system, the framework reports improvements on large-scale real-world traffic: +12.3% click-through rate, +18.7% coverage of long-tail queries, and +24.1% LLM-based holistic quality score. The approach offers a scalable, systematically evaluable way to add semantic retrieval to social search.

📝 Abstract
Beyond general web-scale search, social network search uniquely enables users to retrieve information and discover potential connections within their social context. We introduce a framework of modernized Facebook Group Scoped Search by blending traditional keyword-based retrieval with embedding-based retrieval (EBR) to improve the search relevance and diversity of search results. Our system integrates semantic retrieval into the existing keyword search pipeline, enabling users to discover more contextually relevant group posts. To rigorously assess the impact of this blended approach, we introduce a novel evaluation framework that leverages large language models (LLMs) to perform offline relevance assessments, providing scalable and consistent quality benchmarks. Our results demonstrate that the blended retrieval system significantly enhances user engagement and search quality, as validated by both online metrics and LLM-based evaluation. This work offers practical insights for deploying and evaluating advanced retrieval systems in large-scale, real-world social platforms.
Problem

Research questions and friction points this paper is trying to address.

Hybrid keyword and embedding retrieval for social search
LLM-based evaluation framework for relevance assessment
Improving search relevance and diversity in social networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid keyword and embedding retrieval
LLM-based offline relevance evaluation
Semantic integration into existing pipelines
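
The blending idea listed above can be illustrated with a minimal sketch. The summary does not say how the keyword and EBR result lists are merged, so the example below assumes reciprocal rank fusion (RRF) as a stand-in; the toy posts, the hand-written embedding vectors, and the function names `keyword_search`, `embedding_search`, and `rrf_blend` are all hypothetical and not taken from the paper.

```python
import math
from collections import defaultdict

# Toy group posts (hypothetical data, for illustration only).
POSTS = {
    "p1": "best hiking trails near seattle",
    "p2": "weekend trek routes around mount rainier",
    "p3": "selling used mountain bike",
}

# Hypothetical precomputed embeddings; a real system would use a trained encoder.
EMBEDDINGS = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.8, 0.2, 0.1],
    "p3": [0.1, 0.9, 0.2],
}


def keyword_search(query, posts):
    """Rank posts by how many query terms they contain (stand-in for an inverted index)."""
    terms = set(query.lower().split())
    scored = []
    for pid, text in posts.items():
        overlap = len(terms & set(text.split()))
        if overlap > 0:
            scored.append((pid, overlap))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [pid for pid, _ in scored]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def embedding_search(query_vec, embeddings, top_k=2):
    """Return the top_k post ids by cosine similarity (stand-in for vector search)."""
    ranked = sorted(
        embeddings.items(),
        key=lambda kv: (-cosine(query_vec, kv[1]), kv[0]),
    )
    return [pid for pid, _ in ranked[:top_k]]


def rrf_blend(rankings, k=60):
    """Merge ranked lists with reciprocal rank fusion (assumed fusion rule)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, pid in enumerate(ranking, start=1):
            scores[pid] += 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


keyword_hits = keyword_search("hiking trails", POSTS)
# Assumed query embedding for "hiking trails"; chosen by hand for the example.
semantic_hits = embedding_search([1.0, 0.0, 0.0], EMBEDDINGS)
blended = rrf_blend([keyword_hits, semantic_hits])
print(blended)
```

In this toy run, the keyword path returns only the post that literally contains the query terms, while the embedding path also surfaces the semantically related "trek routes" post, so the blended list covers both. That is the kind of diversity gain the summary attributes to hybrid retrieval, though the production system's scoring and fusion may differ.
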
Authors
Yongye Su
Meta, Menlo Park, CA
Zeya Zhang
Meta, Bellevue, WA
Jane Kou
Meta, Bellevue, WA
Cheng Ju
University of California, Berkeley
Machine Learning, Causal Inference
Shubhojeet Sarkar
Meta, Menlo Park, CA
Yamin Wang
General Electric
power system, optimization, forecasting
Ji Liu
Meta, Bellevue, WA
Shengbo Guo
Meta, Menlo Park, CA