Scholar
Jacob Hilton
Google Scholar ID: WyKvz7EAAAAJ
Alignment Research Center
AI alignment
reinforcement learning
set theory
Citations & Impact
All-time
Citations
31,200
H-index
12
i10-index
15
Publications
20
Co-authors
0
Publications
4 items
Estimating the expected output of wide random MLPs more efficiently than sampling
2026
Cited
0
Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
2025
Cited
0
Obfuscated Activations Bypass LLM Latent-Space Defenses
arXiv.org · 2024
Cited
0
Estimating the Probabilities of Rare Outputs in Language Models
arXiv.org · 2024
Cited
0