Scholar
Rowan Wang
Google Scholar ID: Y4bU0bwAAAAJ
Unknown affiliation
Mechanistic Interpretability
Language Models
Citations & Impact
All-time
Citations
1,399
H-index
4
i10-index
4
Publications
5
Co-authors
0
Contact
No contact links provided.
Publications
4 items
AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors
2026
Cited
0
Believe It or Not: How Deeply do LLMs Believe Implanted Facts?
2025
Cited
0
Eliciting Secret Knowledge from Language Models
2025
Cited
0
Tamper-Resistant Safeguards for Open-Weight LLMs
arXiv.org · 2024
Cited
20