RedOne: Revealing Domain-specific LLM Post-Training in Social Networking Services

📅 2025-07-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing large language models (LLMs) exhibit limited generalization and domain adaptability in social networking services (SNS), largely because prior work relies on single-task paradigms, which hinders effective content moderation and interaction-quality improvement. Method: We propose RedOne, the first LLM designed specifically for SNS, trained with a three-stage strategy of continual pretraining, supervised fine-tuning, and preference optimization, all grounded in large-scale real-world SNS data to achieve robust domain adaptation. RedOne provides a unified foundation supporting eight core SNS tasks, balancing general linguistic competence with domain-specific expertise. Results: Experiments show an average 14.02% improvement across the eight SNS tasks and a 7.56% gain on a bilingual SNS evaluation benchmark. In online deployment, RedOne reduced harmful-content exposure by 11.23% and increased the search-page click-through rate by 14.95%, validating the efficacy and practicality of domain-specialized modeling.
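The summary names the three post-training stages but not their objectives: continual pretraining and supervised fine-tuning both optimize standard next-token cross-entropy, while the preference-optimization stage learns from pairs of preferred and rejected responses. As a minimal illustration, the sketch below implements a DPO-style pairwise loss in PyTorch; the paper does not specify which preference algorithm RedOne uses, so the function name, toy tensors, and beta value here are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO-style preference loss (illustrative; not the paper's exact recipe).

    Each input is a batch of sequence log-probabilities log pi(y | x),
    summed over response tokens, from the policy being trained and from
    a frozen reference model (typically the SFT checkpoint).
    """
    # How much more (or less) the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Implicit reward margin; beta controls deviation from the reference model.
    margin = beta * (chosen_logratio - rejected_logratio)
    # Minimizing -log(sigmoid(margin)) pushes the policy to prefer the chosen response.
    return -F.logsigmoid(margin).mean()

# Toy usage with made-up log-probs for a batch of two preference pairs.
loss = dpo_loss(
    policy_chosen_logps=torch.tensor([-12.3, -9.8]),
    policy_rejected_logps=torch.tensor([-14.1, -9.5]),
    ref_chosen_logps=torch.tensor([-12.0, -10.0]),
    ref_rejected_logps=torch.tensor([-13.5, -9.9]),
)
print(loss)  # scalar tensor; lower means stronger preference for chosen responses
```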

📝 Abstract
As a primary medium for modern information dissemination, social networking services (SNS) have experienced rapid growth, which has posed significant challenges for platform content management and interaction quality improvement. Recently, the development of large language models (LLMs) has offered potential solutions, but existing studies focus on isolated tasks, which not only yields diminishing returns from data scaling within individual scenarios but also fails to adapt flexibly to diverse real-world contexts. To address these challenges, we introduce RedOne, a domain-specific LLM designed to break the performance bottleneck of single-task baselines and establish a comprehensive foundation for SNS. RedOne was developed through a three-stage training strategy consisting of continual pretraining, supervised fine-tuning, and preference optimization, using a large-scale real-world dataset. Through extensive experiments, we show that RedOne maintains strong general capabilities and achieves an average improvement of up to 14.02% across 8 major SNS tasks and 7.56% on an SNS bilingual evaluation benchmark, compared with base models. Furthermore, in online testing, RedOne reduced the exposure rate of harmful content by 11.23% and improved the page click rate in post-view search by 14.95% compared with single-task fine-tuned baseline models. These results establish RedOne as a robust domain-specific LLM for SNS, demonstrating excellent generalization across various tasks and promising applicability in real-world scenarios.
Problem

Research questions and friction points this paper is trying to address.

Enhancing content management in social networking services
Overcoming limitations of single-task LLM approaches
Improving multilingual performance in SNS applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Three-stage training strategy for LLM
Domain-specific LLM for SNS tasks
Large-scale real-world dataset utilization
Fei Zhao
NLP Team, Xiaohongshu Inc., China
Chonggang Lu
NLP Team, Xiaohongshu Inc., China
Yue Wang
NLP Team, Xiaohongshu Inc., China
Zheyong Xie
Xiaohongshu Inc., University of Science and Technology of China
Ziyan Liu
NLP Team, Xiaohongshu Inc., China
Haofu Qian
NLP Team, Xiaohongshu Inc., China
JianZhao Huang
NLP Team, Xiaohongshu Inc., China
Fangcheng Shi
NLP Team, Xiaohongshu Inc., China
Zijie Meng
NLP Team, Xiaohongshu Inc., China
Hongcheng Guo
School of Data Science, Fudan University
Mingqian He
NLP Team, Xiaohongshu Inc., China
Xinze Lyu
NLP Team, Xiaohongshu Inc., China
Yiming Lu
NLP Team, Xiaohongshu Inc., China
Ziyang Xiang
NLP Team, Xiaohongshu Inc., China
Zheyu Ye
Imperial College London
Chengqiang Lu
USTC
Zhe Xu
NLP Team, Xiaohongshu Inc., China
Yi Wu
NLP Team, Xiaohongshu Inc., China
Yao Hu
Zhejiang University
Yan Gao
NLP Team, Xiaohongshu Inc., China
Jun Fan
NLP Team, Xiaohongshu Inc., China
Xiaolong Jiang
NLP Team, Xiaohongshu Inc., China
Weiting Liu
NLP Team, Xiaohongshu Inc., China
Boyang Wang
NLP Team, Xiaohongshu Inc., China
Shaosheng Cao
Xiaohongshu, DiDi Chuxing, Ant Financial, Microsoft Research