Ethan Perez

Google Scholar ID: za0-taQAAAAJ
Anthropic
AI Safety
Citations & Impact (all-time)
Citations: 26,523
H-index: 41
i10-index: 57
Publications: 20
Co-authors: 13 (list available)
Contact
No contact links provided.
Publications
14 items
The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?
2026 · Cited: 0
Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks
arXiv.org · 2026 · Cited: 2
Natural Emergent Misalignment from Reward Hacking in Production RL
2025 · Cited: 0
Agentic Misalignment: How LLMs Could Be Insider Threats
2025 · Cited: 0
Towards Safeguarding LLM Fine-tuning APIs against Cipher Attacks
2025 · Cited: 0
Inverse Scaling in Test-Time Compute
2025 · Cited: 0
Unsupervised Elicitation of Language Models
2025 · Cited: 0
Reasoning Models Don't Always Say What They Think
2025 · Cited: 7
Co-authors
13 total
Douwe Kiela
Contextual AI, Stanford University
Patrick Lewis
Director of Agentic AI, Cohere
Samuel R. Bowman
Anthropic and NYU
Sebastian Riedel
Honorary Professor @ University College London, Researcher @ DeepMind
Kyunghyun Cho
New York University, Genentech
Aaron Courville
Professor, DIRO, Université de Montréal, Mila, Cifar CAI chair
Jared Kaplan
Johns Hopkins University & Anthropic
Evan Hubinger
Member of Technical Staff, Anthropic
