Sole-authored preprint 'Teaching autoregressive language models complex tasks by demonstration' cited by papers out of Google Brain and DeepMind and discussed on Machine Learning Street Talk.
One of four winners of the AI Impacts essay competition on the Automation of Wisdom and Philosophy (out of 90 entries).
Third Prize recipient in the Inverse Scaling Prize competition, which focused on identifying tasks where larger language models exhibit decreased performance.
Co-authored 'Risk perceptions of COVID-19 around the world', referenced by U.S. News, The Telegraph, The Daily Mail, BBC Future, and 130 other outlets.
Submitted multiple papers and preprints, including 'Confirmation bias: A challenge for scalable oversight' and 'FindTheFlaws: Annotated errors for use in scalable oversight research'.
Participated in the study 'Large language models are more persuasive than incentivized human persuaders' as the analysis team lead.
Contributed one or more questions that were selected for the dataset in 'Humanity's Last Exam', earning co-authorship.
Participated in the research 'Foundational challenges in assuring alignment and safety of large language models'.
Submitted a winning task and became a co-author of 'Inverse scaling: When bigger isn't better'.
Research Experience
Director of Modulo Research, focusing on the evaluation and alignment of large language models.
Contributed to Anwar et al.'s monumental agenda paper, 'Foundational Challenges in Assuring Alignment and Safety of Large Language Models'.
Participated in some of a leading frontier lab's Frontier Red Team evaluation/demo projects as part of collaborations with Hidden Variable Limited.
Led user testing research/evaluation of patient-friendly genetic reports and the widely used prognostic tool Predict: Breast Cancer at the University of Cambridge's Winton Centre for Risk and Evidence Communication.
Investigated the capabilities, properties, and applications of distributional models trained on large text corpora.
Conducted various studies of human semantic memory and how risk is communicated, perceived, and predicted.
Background
Cognitive scientist working on the evaluation and alignment of large language models as the director of Modulo Research.
Miscellany
Wrote an alphabet book about exoplanets (sadly uncalibrated to the reading level of any child young enough to still be interested in alphabet books).