LLM-Powered Nuanced Video Attribute Annotation for Enhanced Recommendations

📅 2025-10-08
🤖 AI Summary
Traditional machine learning classifiers struggle with fine-grained semantic understanding of short videos (e.g., mood and emotion) because of lengthy development cycles and limited representational capacity. To address this, the paper proposes an “LLM-as-annotators” approach that uses large language models as scalable, high-accuracy automatic annotation engines. The approach combines multimodal features, optimized inference, and knowledge distillation to enable high-quality offline bulk annotation of fine-grained video attributes. Target attributes are refined through iterative definition and evaluation cycles, and the resulting LLM annotations outperform human raters in offline evaluation. When the annotations are integrated into a personalized retrieval system, online A/B tests show significant improvements in user participation and satisfied consumption. The work examines both the annotation efficacy and the deployment feasibility of LLMs for industrial-scale short-video understanding.

📝 Abstract
This paper presents a case study on deploying Large Language Models (LLMs) as an advanced "annotation" mechanism to achieve nuanced content understanding (e.g., discerning content "vibe") at scale within an industrial short-form video recommendation system. Traditional machine learning classifiers for content understanding suffer from protracted development cycles and lack deep, nuanced comprehension. The "LLM-as-annotators" approach addresses these limitations by significantly shortening development times and enabling the annotation of subtle attributes. This work details an end-to-end workflow encompassing: (1) iterative definition and robust evaluation of target attributes, refined by offline metrics and online A/B testing; (2) scalable offline bulk annotation of video corpora using LLMs with multimodal features, optimized inference, and knowledge distillation for broad application; and (3) integration of these rich annotations into the online recommendation serving system, for example, through personalized restrict retrieval. Experimental results demonstrate the efficacy of this approach: LLMs outperform human raters in offline annotation quality for nuanced attributes and yield significant improvements in user participation and satisfied consumption in online A/B tests. The study provides insights into designing and scaling production-level LLM pipelines for rich content evaluation, highlighting the adaptability and benefits of LLM-generated nuanced understanding for enhancing content discovery, user satisfaction, and the overall effectiveness of modern recommendation systems.
Problem

Research questions and friction points this paper is trying to address.

Using LLMs to annotate nuanced video attributes for recommendations
Addressing slow development cycles of traditional content classifiers
Integrating rich annotations into online recommendation serving systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs annotate nuanced video attributes for recommendations
End-to-end workflow integrates offline annotation and online serving
Multimodal features and knowledge distillation enable scalable application
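
The three workflow stages named in the abstract (annotate, evaluate, restrict retrieval) can be illustrated with a toy sketch. This is not the paper's implementation: the `Video` type, the keyword-based `annotate` function (a stand-in for prompting an LLM with multimodal features), the cue words, the gold labels, and the relevance scores are all hypothetical, and knowledge distillation is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Video:
    video_id: str
    transcript: str  # hypothetical stand-in for multimodal features
    attributes: set = field(default_factory=set)


def annotate(video, cue_words):
    """Placeholder annotator; a real pipeline would query an LLM here."""
    tokens = set(video.transcript.lower().split())
    return bool(tokens & cue_words)


def precision_recall(predicted, gold):
    """Offline evaluation of boolean annotations against gold labels."""
    tp = sum(1 for k, v in predicted.items() if v and gold[k])
    fp = sum(1 for k, v in predicted.items() if v and not gold[k])
    fn = sum(1 for k, v in predicted.items() if not v and gold[k])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def restrict_retrieve(candidates, attribute, scores, k):
    """Keep only videos carrying the attribute, then rank by user score."""
    eligible = [v for v in candidates if attribute in v.attributes]
    return sorted(eligible, key=lambda v: scores[v.video_id], reverse=True)[:k]


# Toy corpus and labels (all values are made up for illustration).
corpus = [
    Video("v1", "calm rain sounds for sleep"),
    Video("v2", "loud stadium goal celebration"),
    Video("v3", "soft piano and calm ocean waves"),
]
cues = {"calm", "soft"}
gold = {"v1": True, "v2": False, "v3": True}

# Stage 2: offline bulk annotation.
predicted = {v.video_id: annotate(v, cues) for v in corpus}
for v in corpus:
    if predicted[v.video_id]:
        v.attributes.add("relaxing")

# Stage 1: evaluate the attribute definition against gold labels.
precision, recall = precision_recall(predicted, gold)

# Stage 3: retrieval restricted to the annotated attribute for one user.
user_scores = {"v1": 0.4, "v2": 0.9, "v3": 0.7}
top = restrict_retrieve(corpus, "relaxing", user_scores, k=2)
print(precision, recall, [v.video_id for v in top])
```

In this sketch the highest-scoring video for the user (`v2`) is excluded because it lacks the annotated attribute, which is the point of restricting retrieval by annotation rather than ranking alone.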
Authors

Boyuan Long
Google, Mountain View, CA, USA
Yueqi Wang
Google, New York, NY, USA
Hiloni Mehta
Google, Mountain View, CA, USA
Mick Zomnir
Google, Mountain View, CA, USA
Omkar Pathak
Google, Mountain View, CA, USA
Changping Meng
Google, Mountain View, CA, USA
Ruolin Jia
Google, Mountain View, CA, USA
Yajun Peng
Google, Mountain View, CA, USA
Dapeng Hong
Google, Mountain View, CA, USA
Xia Wu
Central University of Finance and Economics
Entanglement Theory, Quantum Information Theory, Foundations of Quantum Theory
Mingyan Gao
Google, Mountain View, CA, USA
Onkar Dalal
Stanford University
graphical models, optimization algorithms, machine learning, data mining
Ningren Han
Google, Mountain View, CA, USA