John Hughes

Google Scholar ID: XswwnxUAAAAJ
Anthropic
Research interests: scalable oversight, adversarial robustness, automatic speech recognition
Citations & Impact (all-time)
  • Citations: 560
  • H-index: 7
  • i10-index: 7
  • Publications: 10
  • Co-authors: 0
Resume (English only)
Academic Achievements
  • 1. Best Paper Award: 'Debating with More Persuasive LLMs Leads to More Truthful Answers' received a Best Paper award at ICML 2024.
  • 2. Publication: 'Best-of-N Jailbreaking', targeting ICML 2025.
  • 3. Publication: 'Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach' presented at NeurIPS 2024 AdvML Frontiers (Oral) & SoLaR.
  • 4. Publication: 'Debating with More Persuasive LLMs Leads to More Truthful Answers' presented at ICML 2024 (Oral, Best Paper).
  • 5. Supported Work: 'Looking Inward: Language Models Can Learn About Themselves by Introspection' accepted at ICLR 2025.
Research Experience
  • Independent Alignment Researcher at Anthropic. Also advises Speechmatics on advancing its speech recognition products, including the low-latency voice assistant Flow. Before moving into AI safety research, worked at Speechmatics as a machine learning engineer and manager, contributing to its latest speech-to-text system, Ursa.
Background
  • Independent Alignment Researcher focusing on scalable oversight and adversarial robustness. Enjoys fast empirical research with LLMs. Supervised by Ethan Perez since summer 2023 as part of the MATS Program. Provides technical support and coaching for Ethan Perez's MATS 7 cohort.
Miscellany
  • Hobbies include FPV drones and personal music production.