1. Best Paper Award: 'Debating with More Persuasive LLMs Leads to More Truthful Answers' won Best Paper at ICML 2024.
2. Publication: 'Best-of-N Jailbreaking', targeting ICML 2025.
3. Publication: 'Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach' presented at NeurIPS 2024 AdvML Frontiers (Oral) & SoLaR.
4. Publication: 'Debating with More Persuasive LLMs Leads to More Truthful Answers' presented at ICML 2024 (Oral, Best Paper).
5. Supported Work: 'Looking Inward: Language Models Can Learn About Themselves by Introspection' accepted for ICLR 2025.
Research Experience
Works as an Independent Alignment Researcher at Anthropic and advises Speechmatics on advancing its speech recognition products, including the low-latency voice assistant Flow. Before moving into AI safety research, worked as a machine learning engineer and manager at Speechmatics, contributing to its latest speech-to-text system, Ursa.
Background
Independent Alignment Researcher, focusing on scalable oversight and adversarial robustness. Enjoys fast empirical research with LLMs. Has been supervised by Ethan Perez since Summer 2023 as part of the MATS Program. Provides technical support and coaching for Ethan's MATS 7 cohort.
Miscellany
Hobbies include FPV drones and personal music production.