Hannah Rose Kirk
Google Scholar ID: Fha8ldEAAAAJ
University of Oxford
Large language models · NLP · Ethics in AI · Alignment · AI Safety
Citations & Impact (All-time)
  • Citations: 2,778
  • h-index: 24
  • i10-index: 28
  • Publications: 20
  • Co-authors: 25
Academic Achievements
  • Oct 2024: 'LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages' accepted as an oral presentation at NeurIPS 2024 (top 0.5% of submissions).
  • Oct 2024: Contributed to 'The PRISM Alignment Project', exploring what participatory, representative, and individualized human feedback reveals about subjective and multicultural alignment of LLMs.
  • 2023–2024: Awarded Microsoft’s Accelerating Foundation Models Research Programme grant for 'Personalised and diverse feedback for humans-and-models-in-the-loop'.
  • 2022–2024: Awarded Meta AI Dynabench Grant for 'Optimizing feedback between humans-and-model-in-the-loop'.
  • 2020–2024: ESRC PhD Scholarship (Digital Social Science Pathway).
Research Experience
  • Sep 2024–Present: Research Scientist (Societal Impacts), UK AI Safety Institute, His Majesty's Government. Investigating social and psychological capabilities of frontier AI.
  • Sep–Dec 2023: Visiting Academic in Data Science, New York University. Collaborated with Prof. He and Prof. Bowman on human-AI coordination and LLM alignment.
  • Aug–Dec 2023: Red-Teamer and Consultant, OpenAI. Red-teamed DALL-E and GPT-4 models to improve their safety.
  • Feb–Oct 2023: External Student Researcher, Google. Co-hosted an adversarial challenge to identify unsafe failure modes in text-to-image models.
  • Sep 2021–Sep 2023: Data Scientist in Online Safety, The Alan Turing Institute. Worked on monitoring and detecting harmful language online.
  • Sep 2021–Jul 2023: Research Scientist, Rewire Online. Implemented NLP solutions for online safety.
  • Oct 2020–Oct 2023: Research Labs Manager, Oxford Artificial Intelligence Society. Led student research projects on AI bias.
  • Sep 2019–Sep 2020: Research Scholar, The Berggruen Institute, China Center. Explored links between Chinese philosophy, AI, and privacy.
Background
  • Currently pursuing a PhD at the University of Oxford and working as a Research Scientist at the UK AI Safety Institute.
  • Research focuses on human-and-model-in-the-loop feedback and data-centric AI alignment.
  • Passionate about the societal impacts of AI systems as they scale across capabilities, domains, and populations.
  • Published work spans computational linguistics, economics, ethics, and sociology, addressing alignment, bias, fairness, and hate speech from a multidisciplinary perspective.
  • Frequently collaborates with industry and policymakers.