Published 'GPQA: A Graduate-Level Google-Proof Q&A Benchmark' (COLM 2024, Spotlight), introducing a benchmark of graduate-level science questions written by domain experts to be extremely difficult even for skilled non-experts with unrestricted web access.
Co-authored 'Debate Helps Supervise Unreliable Experts' (arXiv preprint), presenting evidence that debate between informed experts helps non-expert judges reach more accurate verdicts than hearing from a single expert.
Received Best Paper Award at The Big Picture Workshop (2023) for 'The Case for Scalable, Data-Driven Theory'.
Published 'Inducing Semantic Roles Without Syntax' (Findings of ACL 2021) with Luke Zettlemoyer.
Released the GPQA dataset as a hard, 'Google-proof' testbed for scalable oversight research, which requires evaluations difficult enough that non-expert supervisors cannot verify answers directly.
Delivered invited talks at institutions including NYU and UW on scientific paradigms in NLP and on AI ethics.
Background
Currently works on AI safety, evaluation, and alignment at Meta.
Research interests span AI alignment, including scalable oversight, agent alignment, and debate as a training and evaluation paradigm, as well as the formal semantics of natural language.
Aims to advance the science of language through data and machine learning, especially in syntax and semantics.
Advocates for empirical methods in the science of AI and NLP to better understand intelligent behavior and language use.
Formerly led the Safety, Evaluations, and Alignment Lab (SEAL) at Scale AI.