Scholar
Boyi Wei
Google Scholar ID: sRDckqEAAAAJ
PhD student, Princeton University
AI Safety
Alignment
Citations & Impact
All-time
Citations
451
H-index
6
i10-index
6
Publications
8
Co-authors
41
Publications
8 items
Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism
2026
Cited
0
Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models
2025
Cited
0
Scaling Latent Reasoning via Looped Language Models
2025
Cited
0
Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation
2025
Cited
0
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents
2025
Cited
0
Dynamic Risk Assessments for Offensive Cybersecurity Agents
2025
Cited
0
An Adversarial Perspective on Machine Unlearning for AI Safety
2024
Cited
14
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
arXiv.org · 2024
Cited
36
Co-authors
41 total
Yangsibo Huang
Google DeepMind
Peter Henderson
Princeton University
Xiangyu QI
OpenAI
Co-author 4
Prateek Mittal
Professor, Princeton University
Kaixuan Huang
Princeton University
Luxi He
Department of Computer Science, Princeton University
Udari Madhushani Sehwag
Research Scientist, Scale AI