Scholar
Ethan Perez
Google Scholar ID: za0-taQAAAAJ
Anthropic
AI Safety
Citations & Impact
All-time
Citations: 26,523
H-index: 41
i10-index: 57
Publications: 20
Co-authors: 13 (list available)
Publications
14 items
The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity? (2026), cited 0
Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks (arXiv.org, 2026), cited 2
Natural Emergent Misalignment from Reward Hacking in Production RL (2025), cited 0
Agentic Misalignment: How LLMs Could Be Insider Threats (2025), cited 0
Towards Safeguarding LLM Fine-tuning APIs against Cipher Attacks (2025), cited 0
Inverse Scaling in Test-Time Compute (2025), cited 0
Unsupervised Elicitation of Language Models (2025), cited 0
Reasoning Models Don't Always Say What They Think (2025), cited 7
Co-authors
13 total
Douwe Kiela
Contextual AI, Stanford University
Patrick Lewis
Director of Agentic AI, Cohere
Samuel R. Bowman
Anthropic and NYU
Sebastian Riedel
Honorary Professor @ University College London, Researcher @ DeepMind
Kyunghyun Cho
New York University, Genentech
Aaron Courville
Professor, DIRO, Université de Montréal, Mila, Cifar CAI chair
Jared Kaplan
Johns Hopkins University & Anthropic
Evan Hubinger
Member of Technical Staff, Anthropic