Scholar
Yutao Mou
Google Scholar ID: f71f5YkAAAAJ
Peking University
AI Safety
LLM Alignment
Citations & Impact
All-time
Citations
270
H-index
9
i10-index
9
Publications
20
Co-authors
11
list available
Contact
Email
yutao.mou@stu.pku.edu.cn
Publications
5 items
ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback
2026
Cited
0
Decoupling Safety into Orthogonal Subspace: Cost-Efficient and Performance-Preserving Alignment for Large Language Models
2025
Cited
0
AutoRed: A Free-form Adversarial Prompt Generation Framework for Automated Red Teaming
2025
Cited
0
Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective
2025
Cited
0
SaRO: Enhancing LLM Safety through Reasoning-based Alignment
2025
Cited
0
Resume (English only)
Academic Achievements
- Publications:
1. SaRO: Enhancing LLM Safety through Reasoning-based Alignment
2. Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective
3. SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types
4. UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle
5. Decoupling Pseudo Label Disambiguation and Representation Learning for Generalized Intent Discovery
Research Experience
- KCL Lab, Peking University, PhD student, supervised by Prof. Wei Ye and Prof. Shikun Zhang
Education
- Beijing University of Posts and Telecommunications, Master's, 2024
- Beijing University of Posts and Telecommunications, Bachelor's, 2021
- Supervisors: Prof. Wei Ye and Prof. Shikun Zhang
Background
- Research Interests: Building safe, reliable, and scalable artificial intelligence systems
- Main Research Areas:
1. Safety Evaluation and Red Teaming of Large Language Models (LLMs)
2. Post-Training and Safety Alignment of LLMs
Miscellany
Contact: Email / Scholar / GitHub
Co-authors
11 total
Jingang Wang
Meituan
Xiaoshuai Song
Beijing University of Posts and Telecommunications
Wei Wu (武威)
Researcher at Ant Group
Wei Ye
Peking University
Shikun Zhang
Peking University
Yanan Wu
Alibaba Group
Xubo Liu
Meta Superintelligence Labs