
Long Phan

Google Scholar ID: fVRQn4wAAAAJ

Center for AI Safety

LLM, AI Safety


Citations & Impact

Citations (all-time): 4,743


Contact

Email: my_first_name@safe.ai

Publications

9 items

Reducing Political Manipulation with Consistency Training

2026

A Definition of AGI

2025

TextQuests: How Good are LLMs at Text-Based Video Games?

2025

Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark

2025

Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs

2025

Humanity's Last Exam

2025

Tamper-Resistant Safeguards for Open-Weight LLMs

arXiv.org · 2024

Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?

arXiv.org · 2024

Resume

Academic Achievements

Co-authored papers including 'Humanity's Last Exam', 'Improving Alignment and Robustness with Circuit Breakers', and 'HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal'. Also contributed to large open-source projects such as BLOOM.

Research Experience

Works at the Center for AI Safety on multiple research projects, including Humanity's Last Exam, Improving Alignment and Robustness with Circuit Breakers, and HarmBench.

Education

Received a B.S. in Computer Science from Case Western Reserve University in 2023. Worked with Trieu H. Trinh and Minh-Thang Luong (DeepMind) during undergraduate studies.

Background

Currently a Research Engineer at the Center for AI Safety, working with Dan Hendrycks. Interested in AI safety.

Miscellany

Ranked 1st in North America for Amumu (3rd globally) and 3rd in North America for Gragas in League of Legends, Season 2023.

Co-authors

4 total

Dan Hendrycks

Director of the Center for AI Safety (advisor for xAI and Scale)

Senior Staff Research Scientist at Google