
Boyi Wei

Google Scholar ID: sRDckqEAAAAJ
PhD student, Princeton University
AI Safety, Alignment
Citations & Impact
All-time
Citations: 451
H-index: 6
i10-index: 6
Publications: 8
Co-authors: 41 (list available)
Contact
No contact links provided.
Publications
8 items
Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism (2026). Cited: 0
Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models (2025). Cited: 0
Scaling Latent Reasoning via Looped Language Models (2025). Cited: 0
Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation (2025). Cited: 0
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents (2025). Cited: 0
Dynamic Risk Assessments for Offensive Cybersecurity Agents (2025). Cited: 0
An Adversarial Perspective on Machine Unlearning for AI Safety (2024). Cited: 14
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors (arXiv.org, 2024). Cited: 36
Co-authors
41 total
Yangsibo Huang, Google DeepMind
Peter Henderson, Princeton University
Xiangyu QI, OpenAI
Co-author 4
Prateek Mittal, Professor, Princeton University
Kaixuan Huang, Princeton University
Luxi He, Department of Computer Science, Princeton University
Udari Madhushani Sehwag, Research Scientist, Scale AI
