
Erik Jones

Google Scholar ID: _-CU2CsAAAAJ

UC Berkeley

Machine Learning


Citations & Impact

All-time

Citations: 2,082

H-index: 14

i10-index: 14

Publications: 18

Co-authors: 16



Publications

8 items

AI Organizations are More Effective but Less Aligned than Individual Agents (2026). Cited by: 0

Abstractive Red-Teaming of Language Model Character (2026). Cited by: 1

Eliciting Harmful Capabilities by Fine-Tuning On Safeguarded Outputs (2026). Cited by: 3

Beyond Data Filtering: Knowledge Localization for Capability Removal in LLMs (2025). Cited by: 0

Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples (2025). Cited by: 0

Uncovering Gaps in How Humans and LLMs Interpret Subjective Language (2025). Cited by: 0

Forecasting Rare Language Model Behaviors (2025). Cited by: 0

How Do Large Language Monkeys Get Their Power (Laws)? (2025). Cited by: 0


Co-authors

16 total

Jacob Steinhardt

Stanford University

Stanford University

Pranav Rajpurkar, PhD

Associate Professor of Biomedical Informatics, Harvard Medical School

Google and University of Washington

Aditi Raghunathan

Assistant professor, Carnegie Mellon University

Assistant Professor at UC Berkeley // Director, AI Safety and Alignment, Google DeepMind

Associate Professor of Computer Science, Stanford University

Microsoft Research