Published 'GPQA: A Graduate-Level Google-Proof Q&A Benchmark' (COLM 2024, Spotlight), introducing a benchmark of graduate-level science questions written by domain experts to be extremely difficult even for skilled non-experts with unrestricted web access.
Co-authored 'Debate Helps Supervise Unreliable Experts' (arXiv preprint), presenting evidence that debate between informed experts helps non-expert judges reach more accurate verdicts than hearing from a single expert.
Received Best Paper Award at The Big Picture Workshop (2023) for 'The Case for Scalable, Data-Driven Theory'.
Published 'Inducing Semantic Roles Without Syntax' (Findings of ACL 2021) with Luke Zettlemoyer.
Released the GPQA dataset as a hard, 'Google-proof' testbed for scalable oversight research, which requires evaluations difficult enough that non-expert supervisors cannot verify answers directly.
Delivered invited talks at institutions including NYU and UW on scientific paradigms in NLP and on AI ethics.
Background
Currently works on AI safety, evaluation, and alignment at Meta.
Research interests span AI alignment, including scalable oversight, agent alignment, and debate as a training and evaluation paradigm, as well as the formal semantics of natural language.
Aims to advance the science of language through data and machine learning, especially in syntax and semantics.
Advocates for empirical methods in the science of AI and NLP to better understand intelligent behavior and language use.
Formerly led the Safety, Evaluations, and Alignment Lab (SEAL) at Scale AI.