Scholar

Jacob Hilton

Google Scholar ID: WyKvz7EAAAAJ

Alignment Research Center

AI alignment · reinforcement learning · set theory

Citations & Impact

All-time

Citations: 31,200

H-index: 12

i10-index: 15

Publications: 20

Co-authors: 0

Publications

4 items

Estimating the expected output of wide random MLPs more efficiently than sampling

2026

Cited: 0

Subliminal Learning: Language models transmit behavioral traits via hidden signals in data

2025

Cited: 0

Obfuscated Activations Bypass LLM Latent-Space Defenses

arXiv.org · 2024

Cited: 0

Estimating the Probabilities of Rare Outputs in Language Models

arXiv.org · 2024

Cited: 0

Co-authors

0 total
